They may wish to consider using different SNAT pools for employees vs. data-center bots to avoid the confusion. Perhaps name the PTR for employee pools something like employee-snat-1000.building.region.fb.net. If the bot SNAT pools have unique names and CIDR SWIP delegation it would also be easier to identify bots regardless of user-agent and easier to know who to complain to.
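To make the PTR idea concrete, here is a rough sketch of how a site operator might bucket traffic by reverse-DNS name. It assumes a naming convention modeled on the example hostname above; the `bot-snat-` prefix, both patterns, and the `classify_ptr` helper are hypothetical, not anything Meta is known to publish.

```python
import re

# Hypothetical naming convention, modeled on the example hostname in the
# comment above. Neither pattern reflects real Meta reverse-DNS records.
EMPLOYEE_PTR = re.compile(r"^employee-snat-\d+\.[a-z0-9-]+\.[a-z0-9-]+\.fb\.net\.?$")
BOT_PTR = re.compile(r"^bot-snat-\d+\.[a-z0-9-]+\.[a-z0-9-]+\.fb\.net\.?$")


def classify_ptr(hostname: str) -> str:
    """Return 'employee', 'bot', or 'unknown' for a reverse-DNS hostname."""
    name = hostname.strip().lower()
    if EMPLOYEE_PTR.match(name):
        return "employee"
    if BOT_PTR.match(name):
        return "bot"
    return "unknown"
```

The hostname itself would come from a reverse lookup of the client IP, for example via `socket.gethostbyaddr`, and should be forward-confirmed before being trusted, since anyone can set a PTR on address space they control.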
Alternatively, require employees to use a dirty-net VPN to view porn or other edgy/taboo content, so it does not even come from Meta-allocated networks but is still routed through their security devices, as opposed to consumer commercial VPNs, in the spirit of preserving the "top-10 porn viewer awards". If anyone asks about the dirty-net VPN, it is for researching malware, ransomware and sites posing as Meta/FB.
Employees shouldn’t be using the corporate network to view porn. Everyone nowadays (especially Meta employees) has smartphones with personal hotspot capability. Just use that if you really need access to such content in the office.
In fact I don’t really buy their reasoning - surely everyone working as a software engineer knows that SNIs are unencrypted and any corporate network will log them, so nobody should be accessing any kind of objectionable content via the corporate network for their own good to avoid embarrassment.
So I suspect the porn access may have been due to an actual AI scraper (not arguing that they trained on porn - it could very well be that there are 2 stages - one is a basic scraper more or less following links, including accidentally following a pornographic link posted on a non-porn site, and another stage classifies the content and then decides whether it should be used for training or discarded).
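Roughly what I mean by two stages, as a toy sketch. Everything here is made up for illustration (the `Page` type, the function names, the keyword list standing in for a real classifier); it is not a claim about how any actual pipeline works.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, List


@dataclass
class Page:
    url: str
    text: str


def crawl(fetched: Iterable[Page]) -> Iterator[Page]:
    """Stage 1: pass along everything that was fetched, with no filtering."""
    yield from fetched


def select_for_training(pages: Iterable[Page],
                        is_unwanted: Callable[[Page], bool]) -> List[Page]:
    """Stage 2: keep only the pages the classifier does not flag."""
    return [page for page in pages if not is_unwanted(page)]


def keyword_classifier(page: Page) -> bool:
    """Placeholder classifier: flags a page if it contains a blocked word."""
    blocked = {"adult", "xxx"}  # hypothetical list, stands in for a real model
    return any(word in blocked for word in page.text.lower().split())
```

The point is that stage 1 still shows up in someone's access logs even when stage 2 throws the content away.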
If this was actually employees accessing or torrenting pornographic content, I’d expect this to be quickly followed up on and the employee facing dismissal after a stern warning (or an IT security investigation if they persevere in claiming they didn’t do it, as it could suggest malware using their machine to proxy traffic).
Someone probably got sloppy with wifi on their cell phone.
But I don’t think an occasional user being sloppy would be grounds for a lawsuit. So I suspect a scraper did indeed go haywire and downloaded porn en masse.
Whether they trained on it is another matter. I suspect that due to the prevalence of porn on the internet there would be incentives to filter out such content for model training (if anything, just because its quantity would outweigh the non-porn content).
> “[T]he small number of downloads—roughly 22 per year on average across dozens of Meta IP addresses—is plainly indicative of private personal use, not a concerted effort to collect the massive datasets Plaintiffs allege are necessary for effective AI training,” Meta writes.
Not exactly scraper-level downloading.
> Meta: Pirated Adult Film Downloads Were for "Personal Use," Not AI Training
Do they have a statistic on how much they "used" them? Would be nice to know what kind of adult films Meta "employees" prefer: BDSM? Shemale? Lesbian? Gay? /s