“We specialise in eCommerce recruitment, which as you can imagine is pretty niche!! How do you see data science and ChatGPT supporting niche businesses?” – Holly
I think that tight niches are where data science really shines, actually. ChatGPT is a large language model – it is a foundational model. But the real power comes in building smaller niche models on top of a foundational model that can truly appear like an expert in a particular space. ChatGPT right now doesn’t feel like an expert – it feels like a regurgitator of Wikipedia and other online content. But when we are training language models in the really narrow niches, it can feel like there is a lot more expertise there.
In recruitment in particular, I see a couple of use cases. You’re probably searching for very specific experience that might not be a search filter on a job listing board. But if you could put together some resumes of your ideal candidate and train a model with that, it could pore through incoming resumes and score them on quality. That would save countless hours of sifting through resumes, or countless dollars on paying a recruiter. This isn’t ChatGPT, but it is machine learning and data science.
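To make the resume-scoring idea a bit more concrete, here is a minimal sketch in Python. It is not a trained model: as a stand-in, it builds a word-count profile from a few ideal resumes and scores each incoming resume by cosine similarity to that profile. The function names and the scoring approach are my own assumptions for illustration; a real system would likely use a proper classifier or embeddings.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def build_profile(ideal_resumes):
    """Sum word counts across the ideal resumes into one reference profile."""
    profile = Counter()
    for resume in ideal_resumes:
        profile.update(tokenize(resume))
    return profile


def cosine_similarity(a, b):
    """Cosine similarity between two word-count vectors (0.0 to 1.0)."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty resume or empty profile shares nothing
    return dot / (norm_a * norm_b)


def score_resume(profile, resume):
    """Score one incoming resume against the ideal-candidate profile."""
    return cosine_similarity(profile, Counter(tokenize(resume)))


def rank_resumes(ideal_resumes, incoming):
    """Return (score, resume) pairs, best match first."""
    profile = build_profile(ideal_resumes)
    scored = [(score_resume(profile, r), r) for r in incoming]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)
```

Calling `rank_resumes(ideal_resumes, incoming)` with two lists of plain-text resumes returns the incoming ones ordered by how closely their vocabulary matches the ideal set. A recruiter would still review the top of the list by hand; the point is only to cut down the pile.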
I could also see the first interview being done by a chatbot. It could ask all the disqualifying questions, as well as output a DISC score or some other personality test data. Weed out the candidates upfront.
“Aside from technical questions and creative prompts, will ChatGPT be able to adapt to emotional prompts, and be able to provide a caring, supportive response?” – Matthew
In a word – unlikely. ChatGPT specifically is focused on trying to assemble perceived facts. When you’re training a large language model, it is difficult to prevent emotional content from seeping in, but I believe that they are trying. And that’s in an effort to avoid embarrassing output like we saw from Bing recently – https://www.npr.org/2023/03/02/1159895892/ai-microsoft-bing-chatbot. Unfortunately, there is a lot more negative sentiment online versus positive, so if anything, ChatGPT’s just trying not to be mean at this point.
I could certainly however see other language models that focus on this specifically. If you took a dataset of therapist or suicide hotline transcripts, for example, and trained a model based on those, then that’s going to be a very caring and supportive model.
In fact – someone please build that. An AI-based therapist app could be a killer use case – TherAIpist?
“Love this format for an AMA! I’ll bite…as the CEO of Barefoot Solutions, how do you see AI tools affecting the software and app development space over time?” – Bobby
Specifically in the app development space, we’re already seeing tools pop up that are using large models in some of the different practices involved. One of the first mainstream tools was GitHub Copilot – https://github.com/features/copilot – which serves up smart suggestions for code completion. We are seeing this in UI/UX design too with tools like Galileo – https://www.usegalileo.ai/ – which will generate a UI for you based on a natural language prompt. In terms of the next phase however, I envision smaller models that are trained on inputs for the specific user, or organization that the user works for. Copilot will suggest best practices, but what about best practices for your particular company? That’s where the space is going I believe, and will be the difference between good and great suggestions. This is also why ChatGPT often returns decent but not excellent outputs – the models are just too large to be ultra-precise.
“Looking at the next 5-10 years, what do you see as the biggest changes that companies will have to address in their data to make it more oriented towards LLMs while serving their customers? What can organizations do to address those large-scale data requirements now, while also addressing current state data science requirements?” – Corban
Corban, I’d say right now the key for LLMs specifically is to store everything, as structured text data when possible. So this would be things like extending email retention policies, automating meeting notes – https://fireflies.ai/ – call transcriptions, support conversations, etc. These are what the LLMs and subsequent SLMs (small language models – not sure if that’s a thing but it should be) need for training. So even if you don’t have a use for it now, you will be glad you stored it at some point.
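As a rough illustration of what "structured text data" could look like in practice, here is a small Python sketch that appends each email, meeting note, or support conversation to a JSON Lines file, one record per line. The `TextRecord` fields and the source labels are hypothetical choices on my part, not a standard; the idea is simply to keep the raw text together with enough metadata to filter it later.

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone


@dataclass
class TextRecord:
    source: str      # hypothetical labels, e.g. "email", "meeting_notes", "support_chat"
    author: str
    created_at: str  # ISO 8601 timestamp
    text: str


def make_record(source, author, text, created_at=None):
    """Build a record, stamping it with the current UTC time if none is given."""
    if created_at is None:
        created_at = datetime.now(timezone.utc).isoformat()
    return TextRecord(source=source, author=author, created_at=created_at, text=text)


def append_records(path, records):
    """Append records to a JSON Lines file, one JSON object per line."""
    with open(path, "a", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(asdict(record), ensure_ascii=False) + "\n")


def load_records(path):
    """Read every non-empty line of the file back into TextRecord objects."""
    with open(path, encoding="utf-8") as f:
        return [TextRecord(**json.loads(line)) for line in f if line.strip()]
```

An append-only text file like this is about the simplest thing that could work. It is easy to move into a database or a training pipeline later, which fits the "store it now, find the use later" advice.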
“How fun! Okay… How many of your responses here were written by ChatGPT? 😂
More seriously, who should have intellectual ownership over content generated or augmented by AI? Does the set of AI training data used that leads to AI answers affect that ownership?” – Michael
** Disclaimer **
Precisely none of these responses were “Generated” 🙂
This gets tricky. In principle, the owner of the model should also own the output of that model. But as you mentioned, the training data can impact this. If the model creator has license or just public rights to the data that it used for training, then I think this principle still stands. But if the model trained on content that it didn’t have rights to, then I believe this will need to be some new form of copyright infringement that doesn’t exist yet. I expect a Supreme Court case coming in the next few years that will set a precedent for this. Actual legislation will likely take much longer.
I’m no attorney however. Nick Transier or Gabriel Levine – care to weigh in here?
“Oo love an AMA! As a marketer I am interested in how you think tools like ChatGPT could affect content generation, beyond what we’re already seeing?” – Naomi
We’re calling this an AAMA. That’s Asynchronous Ask Me Anything. Different than doing it live because it gives me the time to research answers when I don’t know enough about something 🙂
Right now, ChatGPT is not good enough to produce strong marketing content. At best, it can create a skeleton or foundation of content that will then need to be edited by a human in order to generate something of high quality. This will improve over time, and as more specialized models make it into the wild, I think it will be very difficult for a human to discern the difference.
I do think however that the FTC will move to enforce some regulations around AI-generated content. Just as we have to label ads as “Sponsored” now, I think that eventually we will have to label generated content as such. People will quickly dismiss “Generated” content, emails, etc. once they are labeled.
I suspect that search engines and spam filters will move much more quickly. There are already tools to identify generated content, like this one put out by OpenAI – https://platform.openai.com/ai-text-classifier. This will not be an arms race, as ChatGPT is not going to actively attempt to be undetectable, while these detection tools will get better and better.
So to sum things up, I think in the short run, a lot of marketers will be experimenting with ChatGPT and other large language models. But I don’t think it will be very successful. At best, I think they will be able to be used as a starting point, but will always need editing by a human for at least one of the reasons listed above.
“What pre-existing streams of revenue in the tech industry do you foresee being negatively impacted (if at all) by the introduction of generative AI?” – Matthew
I think it’s mostly upside Matt, at least at this point. None of the AI tools out there today produce what I would consider production-grade outputs. Typically there still needs to be a round of human edits before something is released into the wild in any sort of production application. So if anything it’s just going to introduce a lot of efficiency into the industry. Take design for instance. UI/UX designers right now are definitely not able to use generative AI to complete the design of a mobile app. But they can use it practically today to create initial skeletons, certain assets, and basic layouts. This allows more time for creativity and working on the harder stuff.
Digital ad targeting, programming, and all sorts of other digital skillsets will just be augmented, not replaced. We are a long way off from the AI being good enough to take the jobs of tech workers.
Fast forward 10-20 years and we’re definitely going to see some disruption in particular industries. Long-haul trucking is an example you see a lot. That job probably won’t exist in 10-20 years. Instead, the job will be sitting behind a dashboard monitoring the performance and safety of a fleet of trucks being driven autonomously.
“I’ve been loving using chatGPT for brainstorming and thought starters, been pretty impressed with what it comes up with! I’ve been wondering how it may impact SEO and whether search engines will evolve to allow for it or restrict.” – Melissa
I predict that search engines will restrict the importance of generated content greatly, nearly down to 0. Detection tools are already coming out and they will only get better.
Even more so, LLMs are going to take over the search engines. ChatGPT doesn’t train on AI-generated content. That would introduce tremendous, self-affirming bias into the model. And because LLMs are going to run the search engines of tomorrow (more like this afternoon), they will deem the generated content as pretty worthless. They need content produced by humans to function.
There is a gray area in here too, which is content originally created by an LLM but then edited to some degree by a human. This is going to be much much harder to detect, and so I think there will be a period here where that type of content gets by the search engines. Ultimately the search engines will find a way to get around this, as they have all the other gray hat tactics like keyword stuffing. But I think it might be a few years before we see that, so there could be a shorter term SEO opportunity for originally generated but human edited content.
“What are the best logical and practical “next steps” in terms of getting rid of the monotony that we have to do? I’m talking SEO, a blog nobody reads but we need keywords, naming our images, etc. In theory, in a BING/GPT/Einstein/Watson world is that just one and done?” – Evan
I encourage you to read my comment in this thread to Melissa Sidebottom, as I covered some of this there. But in short, I do not believe LLM content generation will help with SEO in the medium to long term, but there might be some quick wins you can get with it until the search engines start filtering this stuff out.
I think you could potentially write a blog post, and then have a model try to elegantly add certain keywords into that post. I think that things like naming images, alt tags, etc. would be effective as well. Generating ad copy and creative could probably work. Might not need a designer for certain display campaigns. So I think some of the smaller mundane tasks would be a good fit for this technology, but I do not see it as a silver bullet for genuine content creation.
“Question: what exactly is “Data Science” and how do you see it evolving?” – Above the Fold
Data Science as a term first started to pop up in the 1960s to describe both a new profession and field of academic study to support understanding large amounts of data that had begun to be amassed. Its meaning has evolved over time as an interdisciplinary field most greatly influenced by computer science and statistics. One of the primary purposes of data science is prediction of, well, anything. Data Science is a broad term for the discipline, which includes tactics like machine learning, data analysis, algorithm development, and much more.
While the discipline itself continues to evolve, the most exciting part of the current times in my opinion is the application of this discipline across almost all contexts that you can think of. Applications in generative art, music, and text content have exploded onto the scene thanks in large part to the popularization of ChatGPT. Advances in computer vision, weather forecasting, and autonomous vehicles among myriad more are thanks to the work of data scientists. Now that the infrastructure is available for workers without a PhD to participate, we’re going to see a massive acceleration in the proliferation of these models across all sectors.