AI Patterns and Trends
The following summary comes from Databricks.com's article entitled: "Data + AI in the real world."
----------------------------------
There has been much discussion of AI (artificial intelligence), and of AI-generated prose in particular, in the six months since ChatGPT was released. "The unparalleled pace of AI discoveries, model improvements, and new products on the market puts data and AI strategy at the top of conversations across every organization in the world." (1)
Colleges and universities, companies, and developers who understand and use AI, rather than denounce it, clearly realize that they need to develop policies and adhere to them before regulation impedes the process.
PATTERNS and TRENDS
- Companies are adopting machine learning (ML) and Large Language Models (LLMs) at a very rapid pace. Natural Language Processing (NLP) is dominating "use cases," with an accelerated focus on LLMs.
- Organizations are investing in data integration products as they prioritize more DS/ML (data science/machine learning) initiatives. This is an area worth researching if you buy stock. "Deep learning is one of many approaches to ML. It implements an Artificial Neural Network (ANN), which has multiple layers between its input and output layers. The “deep” in deep learning refers to the many layers in a network that allow for more complex processing. Check out 3Blue1Brown’s* video on neural networks." (2)
- Databricks.com offers a platform called Lakehouse that is increasingly used for data warehousing, as evidenced by the high growth of data integration, which means combining data residing in different sources and providing users with a unified view of them. (3) This process becomes significant in a variety of situations:
- commercial (such as when two similar companies need to merge their databases)
- scientific (such as combining research results from different bioinformatics repositories)
- Data integration appears with increasing frequency as the volume of data (that is, Big Data) and the need to share existing data explode. (4)
- Databricks supports the third-party data integration tools dbt and Fivetran, whose adoption is linked to the accelerated development of Databricks SQL.
- The number of companies using LLMs (through services like ChatGPT) grew 1310% between November 2022 and May 2023.
- NLP accounts for 49% of data science library usage, making it the most popular application in the market
- Organizations are putting substantially more models into production (411%) while also increasing their ML experimentation (54%)
- Organizations are getting more efficient with ML: for every three experimental models, roughly one is put into production, compared with roughly one for every five experimental models in prior years.
- BI is the top data and AI market, but growth trends in other markets show that companies are increasingly looking at more advanced data use cases
- The fastest-growing data and AI product is dbt, which grew 206% by number of customers (a future blog post will be about dbt)
- Data integration is the fastest-growing data and AI market on the Databricks Lakehouse with 117% growth
- 61% of customers migrating to Databricks Lakehouse are coming from on-premises and cloud data warehouses
- The volume of data in Delta Lake has grown 304%
- The Lakehouse is increasingly being used for data warehousing, including serverless data warehousing with Databricks SQL, which grew 441%
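The data integration idea in the list above (combining data that resides in different sources into one unified view) can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration: the two source systems, the field names, the sample rows, and the choice of email as the shared key are hypothetical and do not come from the Databricks article, and this is not how the Lakehouse or any particular integration product works internally.

```python
from typing import Dict, List

# Hypothetical rows from two separate source systems (illustrative data only).
crm_rows = [
    {"email": "ana@example.com", "name": "Ana", "region": "EU"},
    {"email": "bo@example.com", "name": "Bo", "region": "US"},
]
billing_rows = [
    {"email": "bo@example.com", "plan": "pro"},
    {"email": "cy@example.com", "plan": "free"},
]


def unified_view(
    *sources: List[Dict[str, str]], key: str = "email"
) -> Dict[str, Dict[str, str]]:
    """Merge rows from several sources into one record per key value."""
    merged: Dict[str, Dict[str, str]] = {}
    for rows in sources:
        for row in rows:
            # Create the record the first time a key is seen, then add fields.
            # On conflicting field names, later sources overwrite earlier ones.
            merged.setdefault(row[key], {}).update(row)
    return merged


view = unified_view(crm_rows, billing_rows)
for email, record in sorted(view.items()):
    print(email, record)
```

A record that appears in both sources ends up with the fields from each, while a record found in only one source keeps just the fields that source provides. Real integration work also has to handle mismatched keys, duplicates, and conflicting values, which this sketch leaves out.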
(1) From Databricks.com's article entitled: "Data + AI in the real world."
(2) https://medium.com/ds3ucsd/explaining-the-terms-ai-ml-dl-ds-b0ac43e99f5
(3) mikben. "Data Coherency - Win32 apps". docs.microsoft.com. Archived from the original on 2020-06-12. Retrieved 2020-11-23.
(4) Frederick Lane (2006). "IDC: World Created 161 Billion Gigs of Data in 2006". Archived from the original on 2015-07-15.


