Nearly 45 GB of source code files, allegedly stolen by a former employee, have revealed the underpinnings of many of Russian tech giant Yandex’s apps and services. The leak also exposed key ranking factors for Yandex’s search engine, details that had largely never been published.
The “Yandex git sources” archive was posted as a torrent file on January 25. The files appear to have been taken in July 2022, with contents dating back to February 2022. Software engineer Arseniy Shestakov said he had confirmed with current and former Yandex employees that some archives “certainly contain the latest source code for corporate services.” Yandex told security blog BleepingComputer that “Yandex was not hacked” and that the leak came from a former employee. Yandex said it “did not see any threat to user data or platform performance.”
The files notably date to February 2022, when Russia launched its full-scale invasion of Ukraine. A former Yandex executive told BleepingComputer that the leak was “political” and noted that the former employee had not tried to sell the code to Yandex’s competitors. No anti-spam code was leaked either.
It is not clear whether the publication of Yandex’s source code has any security or structural implications, but the leak of 1,922 ranking factors in Yandex’s search algorithm is certainly making waves. SEO consultant Martin MacDonald described the leak on Twitter as “probably the most interesting thing in SEO in the last few years” (as pointed out by Search Engine Land). In a thread detailing some of the more notable factors, researcher Alex Buraks said, “There is a lot of useful information for Google SEO.”
Yandex, the fourth-largest search engine by volume, reportedly employs several former Google employees. Yandex tracks many of Google’s ranking factors, which are identifiable in its code, and competes fiercely with Google. Google’s Russian division recently filed for bankruptcy after losing its bank accounts and payment services. Buraks notes that the first factor on Yandex’s list of ranking factors is “PAGE_RANK”, which at first glance appears to be tied to the foundational algorithm created by Google’s co-founders.
As detailed by Buraks (in two threads), the Yandex engine gives preference to pages that:
- Are not too old
- Have high organic traffic (unique users) and relatively little search-driven traffic
- Have fewer numbers and slashes in their URLs
- Have optimized code rather than “hard pessimization,” with a “PR=0”
- Are hosted on trusted servers
- Happen to be Wikipedia pages or are linked from Wikipedia
- Are hosted on or linked from higher-level pages on a domain
- Include keywords in the URL (up to three)
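To make the URL-related items in the list above concrete, here is a minimal Python sketch of how such signals could be measured for a single page. It is purely illustrative and is not taken from the leaked code: the function name `url_features`, the returned keys, and the substring matching are all assumptions, and the cap of three simply mirrors the "up to three" keywords described in the threads.

```python
from typing import Dict, List
from urllib.parse import urlparse

# Hypothetical cap, mirroring the "up to three keywords" item in the list above.
MAX_COUNTED_KEYWORDS = 3


def url_features(url: str, keywords: List[str]) -> Dict[str, int]:
    """Compute simple, hypothetical URL signals (not Yandex's actual code)."""
    path = urlparse(url).path

    # Count digits and slashes in the path; the list suggests fewer is better.
    digits = sum(ch.isdigit() for ch in path)
    slashes = path.count("/")

    # Count how many distinct keywords occur in the path, capped at three.
    lowered = path.lower()
    matched = sum(1 for kw in keywords if kw.lower() in lowered)

    return {
        "digits": digits,
        "slashes": slashes,
        "keywords_in_url": min(matched, MAX_COUNTED_KEYWORDS),
    }


if __name__ == "__main__":
    print(url_features(
        "https://example.com/blog/2022/01/seo-ranking-factors",
        ["seo", "ranking", "factors", "yandex"],
    ))
```

How a real engine would weight or combine such counts is not something this sketch attempts to show; it only extracts the raw numbers.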
You can search and click through all the factors in Rob Ousbey’s compiled search tool. You may notice that around 1,000 ranking factors are tagged with “TG_DEPRECATED” and over 200 are listed as “TG_UNUSED”. The code is from February 2022 and was taken in July 2022, so Yandex’s search has certainly changed since then. But the leak provides a rare look at how search rankings are compiled at a site serving one of the world’s largest countries.
Yandex previously saw its search engine’s code dumped by a former employee in 2015. He tried to sell it for $28,000 on the black market to fund his own startup. The shockingly low price for the core code of Yandex’s flagship product suggested he did not realize its true value. The employee was sentenced to two years in prison, and the code never appeared publicly.