Lesson #4: Wikipedia vs. AI (Part 2)
This is the second part of a summary of an article in the New York Times Magazine (7/23/23) about how AI chatbots rely on Wikipedia for their current results and success.
Selena Deckelmann, Wikipedia's Chief Technology Officer, recently said, "We've (Wikipedia) had artificial intelligence tools and bots since 2002, and we've had a team dedicated to machine learning since 2017. They are precious for semiautomated content review, and especially for translations."
Wikipedia, bots, and AI are all very different from each other, yet the relationship among them is reciprocal. AI systems have been trained on Wikipedia content for years. To "scrape" a website means to use software to extract its information automatically, rather than copying it by hand.
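To make the idea of scraping concrete, here is a minimal sketch in Python. It is an illustration only, not how any particular AI company collects data. It works on a small made-up page stored in the variable SAMPLE_PAGE instead of a live website, and the names TextExtractor and scrape_text are hypothetical ones chosen for this example.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects the readable text of a web page, skipping script and style blocks."""

    def __init__(self):
        super().__init__()
        self.chunks = []       # pieces of readable text found so far
        self._skip_depth = 0   # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def scrape_text(html):
    """Return the readable text of an HTML document as one string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# A made-up page standing in for a real website (an assumption for this sketch).
SAMPLE_PAGE = """
<html><head><title>Example</title><style>p { color: red; }</style></head>
<body><h1>Wikipedia</h1><p>Wikipedia is a free online <b>encyclopedia</b>.</p>
<script>console.log("ignored");</script></body></html>
"""

print(scrape_text(SAMPLE_PAGE))
```

The program keeps the words a reader would see on the page and throws away the formatting and scripts. Scraping for AI training follows the same basic idea, repeated across a very large number of pages.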
Jesse Dodge, a computer scientist at the Allen Institute for AI, told Gertner (the article's author) that Wikipedia might now make up 3-5% of the scraped data that large language models (LLMs), the technology behind AI chatbots, use for training.
Jimmy Wales, Wikipedia's co-founder, said that Wikipedia's fundamental goal is to spread knowledge as broadly and freely as possible, by whatever means. And therein lies the rub. The question has to be asked: will AI chatbots make Wikipedia obsolete? Wikipedia's open platform allows technology companies to use its content for whatever reason they choose.
The article's author, Jon Gertner, said that "No one I spoke with in the tech community seemed to know if it would even be possible to build a good AI model without Wikipedia."
In April, Stanford University scientists evaluated four AI search engines: Bing Chat, NeevaAI, perplexity.ai, and YouChat. They found that only about half of the sentences these engines generated could be fully supported by actual facts. They said, "We believe that these results are low for systems that may serve as a tool for information-seeking users."
Margaret Mitchell, a computer scientist who studied the ethics of AI at Google, said, "The state-of-the-art until now has just been a laissez-faire data approach. You just throw everything in, and you're operating with a mindset that the more data you have, the more accurate your system will be."
Gertner says that the road ahead will almost certainly lead to improvements, and Mitchell told him that she foresees AI companies making gains in accuracy and reducing biased answers by using better data.
--- End of Wikipedia vs. AI Part 2 of 4 ---
Source: New York Times Magazine article, "Moment of Truth: Can Wikipedia help teach AI chatbots to get their facts right," by Jon Gertner, 7/23/23, page 34.


