The internet contains vast amounts of information that does not show up in Google searches. The so-called “Deep Web” contains data that is not indexed by search engines.
But all that information could soon become accessible to law enforcement agencies and scientists under Memex, a program being developed by the US Defense Advanced Research Projects Agency (DARPA).
Researchers at NASA’s Jet Propulsion Laboratory in Pasadena, California, have joined the program, hoping it will help catalogue the vast amounts of data NASA spacecraft deliver on a daily basis.
“We’re developing next-generation search technologies that understand people, places, things and the connections between them,” said Chris Mattmann, principal investigator for JPL’s work on Memex.
Memex examines not just standard text-based content online but also images, videos, pop-up ads, forms, scripts and other ways information is stored, to see how they are interrelated.
“We’re augmenting Web crawlers to behave like browsers – in other words, executing scripts and reading ads in ways that you would when you usually go online. This information is normally not catalogued by search engines,” Mattmann said.
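To make the idea concrete, here is a minimal sketch of cataloguing more than just the text of a page. It is not Memex code and it does not execute scripts the way Mattmann describes. It only uses Python's standard `html.parser` to record the images, videos, forms and scripts a page refers to, alongside its visible text. The class name `ResourceCollector`, the `collect` helper, the tag-to-attribute mapping and the sample page are all hypothetical, chosen for illustration.

```python
from collections import defaultdict
from html.parser import HTMLParser


class ResourceCollector(HTMLParser):
    """Records visible text plus the non-text resources a page refers to."""

    # Assumed mapping: which attribute holds the resource for each tag.
    TAG_ATTRS = {"img": "src", "video": "src", "script": "src", "form": "action"}

    def __init__(self):
        super().__init__()
        self.resources = defaultdict(list)
        self.text = []
        self._skip_data = False  # True while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_data = True
        attr_name = self.TAG_ATTRS.get(tag)
        if attr_name is not None:
            value = dict(attrs).get(attr_name)
            if value:
                self.resources[tag].append(value)

    def handle_endtag(self, tag):
        if tag in ("script", "style"):
            self._skip_data = False

    def handle_data(self, data):
        # Keep only visible, non-empty text.
        if not self._skip_data and data.strip():
            self.text.append(data.strip())


def collect(html):
    """Return the page's text and its resources grouped by tag name."""
    parser = ResourceCollector()
    parser.feed(html)
    parser.close()
    return {"text": " ".join(parser.text), "resources": dict(parser.resources)}


SAMPLE = """<html><body><h1>Mars rover</h1>
<img src="rover.jpg"><video src="landing.mp4"></video>
<form action="/search"></form><script src="ads.js"></script>
</body></html>"""

page = collect(SAMPLE)
print(page["text"])
print(page["resources"])
```

A crawler that behaves like a browser would go further, running the scripts and following what they load. This sketch only shows the first step: treating images, videos, forms and scripts as things worth indexing next to the text.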
Memex can even recognise what’s in videos and pair it with searches on the same subjects. The search tool could identify the same object across many frames of a video or even different videos.
All of the code written for Memex is open-source.
Bill Condie is a science journalist based in Adelaide, Australia.