Introduction — Artificial Intelligence in software development
Disclaimer: This article was first published on Scalac’s blog.
With the growing discussions about the integration of artificial intelligence (AI) into software development via tools like ChatGPT and Github Copilot, I have explored these AI-driven coding aids for some time. Initially, I engaged with Tabnine, a tool similar in function to Github Copilot, albeit a paid service, and my initial experiences were less than satisfactory. Subsequently, I approached AI-based products from other companies with a certain degree of skepticism.
The tech world is replete with promotional materials and articles that portray AI as a groundbreaking approach to code creation. In some corners, there is talk of AI revolutionizing the very essence of coding. The narrative suggests that developers would no longer need to write individual functions but could rapidly generate entire application modules using these new AI tools.
On the internet, there are numerous demonstrations showcasing how entire applications can be created in mere minutes with the assistance of AI. While the concept appears remarkable, does the ability to generate applications as straightforward as tic-tac-toe or ones with countless examples available online truly signal a revolution in the art of coding?
My personal experiences with AI tools such as ChatGPT and Copilot have been a mix of disappointments and positive discoveries. There have been instances where I abandoned paid tools, only to revisit them later. In my view, the decision of whether to integrate AI into your workflow isn’t a straightforward one. Hinging one’s beliefs on AI as the philosophical cornerstone of programming can lead to considerable frustration when confronted with the technology’s limitations. However, adopting a more open-minded approach can unveil intriguing applications for artificial intelligence.
Key Assumptions
Given that AI tools often provide responses based on the context of prior interactions, this article does not aim to provide specific examples of their use. The outcome of queries can vary for each user. Instead, the focus here is on elucidating the advantages and challenges of employing AI in coding, along with potential areas of application. The examples cited primarily serve to illustrate the issues discussed.
Using Chat Bots - ChatGPT and Github Copilot
Presently, two prominent chatbots for code assistance are ChatGPT and Github Copilot.
ChatGPT facilitates interactions with AI through chat-based communication. Users input text to convey their intentions to the AI, and ChatGPT generates responses. Notably, ChatGPT can respond to programming-related queries. Given its vast training data derived from a substantial portion of the internet, it can proficiently generate substantial portions of application code in response to user input.
In contrast, Github Copilot is exclusively dedicated to programming. It operates within a strict programming context and declines to respond to non-programming queries. While ChatGPT is based on the GPT-3.5 or GPT-4 model (depending on the version), Copilot Chat employs the OpenAI Codex model.
Another difference worth mentioning is that ChatGPT is just a chat, while Github Copilot Chat is integrated with the IDE and has some extra handlers that let you apply the produced suggestions to the current file or create a new file from them.
Using tools such as ChatGPT or Copilot Chat is a straightforward process.
In the chat window, users can simply input their requirements. For instance, when requesting Copilot Chat to generate a tic-tac-toe game using React, it’s crucial to specify the use of functional components, as Copilot Chat tends to suggest class-based components by default. This phenomenon may be attributed to the presence of outdated code snippets within its training data. Additionally, the generated code may lack accompanying styles.
Since the AI chat window maintains the conversation’s context, it’s feasible to request the generation of styles and even a non-existent file, such as the ‘Board’ referred to by the ‘Game’ component.
Consequently, in a matter of a few interactions, one can receive a rudimentary yet functional tic-tac-toe application.
Similar interactions can be accomplished using ChatGPT. Given the context retention, additional modifications or conversions, like translating the code into TypeScript, can be seamlessly requested.
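For orientation, the logic at the heart of such a tic-tac-toe application is small. The following is a minimal, framework-free sketch in TypeScript, written by hand rather than copied from any chatbot session; the names `Cell`, `LINES`, `calculateWinner`, and `play` are assumptions chosen for illustration, and the React components a chatbot generates would wrap logic of roughly this shape.

```typescript
// A cell is either empty (null) or holds one player's mark.
type Cell = "X" | "O" | null;

// The eight winning lines on a 3x3 board, as indices into a flat array of 9 cells.
const LINES: [number, number, number][] = [
  [0, 1, 2], [3, 4, 5], [6, 7, 8], // rows
  [0, 3, 6], [1, 4, 7], [2, 5, 8], // columns
  [0, 4, 8], [2, 4, 6],            // diagonals
];

// Returns the winning mark, or null if nobody has won yet.
function calculateWinner(board: Cell[]): Cell {
  for (const [a, b, c] of LINES) {
    const mark = board[a];
    if (mark != null && mark === board[b] && mark === board[c]) {
      return mark;
    }
  }
  return null;
}

// Returns a new board with the mark placed; the input board is never mutated.
// Moves on an occupied cell, outside the board, or after a win are ignored
// and the original board is returned unchanged.
function play(board: Cell[], index: number, mark: "X" | "O"): Cell[] {
  if (board[index] !== null || calculateWinner(board) !== null) {
    return board;
  }
  const next = board.slice();
  next[index] = mark;
  return next;
}
```

Keeping the rules in pure functions like these also makes the generated code easier to review, because the game logic can be checked without rendering anything.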
Challenges encountered in the utilization of AI chatbots
1. Suboptimal recommendations for less common libraries and unconventional use cases
By way of illustration, when employing Github Copilot Chat for the Rust programming language, I was able to obtain a rudimentary GraphQL configuration using the Juniper library and the Axum framework. When I then requested a transition to the async-graphql library, Copilot Chat executed the transformation with relatively few errors, necessitating only minor adjustments. Yet, as my inquiries extended toward implementing queries or mutations, the number of errors in the generated code grew.
At some point, I found myself spending more time fixing non-functional code than it would have taken to write the code from scratch. Referring back to our example tic-tac-toe game, when I asked the AI to translate the code into the ReScript language, the resulting output, in my instance, featured a litany of inaccuracies and critical errors.
2. Reviewing new iterations of code imposes considerable cognitive load
An additional predicament is that both ChatGPT and Github Copilot Chat tend to produce complete code files. Consequently, the sheer volume of code may make thorough inspection challenging. Moreover, the code might be so similar to previous iterations that it is easy to assume that nothing significant has changed. Frequently, both of these tools introduced changes to my code that I did not intend.
3. Mismatched Variable and Function Nomenclature
Chatbots respond based on past conversations. Without this awareness, you might notice that AI-generated code often uses different names for variables, functions, and modules. It’s important to know that tools like Github Copilot Chat focus mostly on the current conversation, so the code suggestions they provide might look similar but not match the naming style we want to use in our project. They often suggest names that “fit” instead of using the patterns we already use in our project. While you can manually correct the suggestions, it can be frustrating and may not be worth the effort compared to typing a simple code chunk manually.
4. Code Often Does Not Function
Foremost among my vexations with chatbots is the pervasive issue that their generated code is often non-functional. When such shortcomings manifest as naming inaccuracies, they are of minor consequence. The issue becomes more nettlesome when the AI recommends functions or libraries that do not exist. Notably, ChatGPT is capable of composing code replete with plausible-looking functions that are absent from the actual libraries. Often such a function is missing for a reason: the problem we’re trying to solve is more complex than a single call could handle. Moreover, ChatGPT can provide us with links to documentation that doesn’t exist.
Recollections of my prior encounters include a scenario in which I tried to add a month to a date using the Rust-based Chrono library. The solution proffered by ChatGPT exuded an air of ostensible simplicity, notwithstanding the verity that the underlying conundrum was markedly more intricate.
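My case involved Rust and Chrono, but the difficulty is not specific to that library. The sketch below uses TypeScript's built-in `Date` purely as an illustration of why "add one month" is less simple than it looks; it is not the code ChatGPT gave me, and the function names `addMonthNaive` and `addMonthClamped` are hypothetical. Clamping to the last day of the target month is only one possible policy, assumed here for the sake of the example.

```typescript
// Naive approach: bump the month and let Date normalize the result.
// January 31 becomes "February 31", which rolls over into March.
function addMonthNaive(date: Date): Date {
  const result = new Date(date.getTime());
  result.setUTCMonth(result.getUTCMonth() + 1);
  return result;
}

// Clamping approach: if the target month is shorter, use its last day instead.
function addMonthClamped(date: Date): Date {
  const year = date.getUTCFullYear();
  const month = date.getUTCMonth();
  const day = date.getUTCDate();
  // Day 0 of the month after the target is the last day of the target month.
  const lastDayOfTarget = new Date(Date.UTC(year, month + 2, 0)).getUTCDate();
  return new Date(
    Date.UTC(
      year,
      month + 1,
      Math.min(day, lastDayOfTarget),
      date.getUTCHours(),
      date.getUTCMinutes(),
      date.getUTCSeconds(),
      date.getUTCMilliseconds(),
    ),
  );
}

const january31 = new Date(Date.UTC(2023, 0, 31));
console.log(addMonthNaive(january31).toISOString());   // 2023-03-03T00:00:00.000Z
console.log(addMonthClamped(january31).toISOString()); // 2023-02-28T00:00:00.000Z
```

Neither result is "the" right answer; which one is correct depends on the business rule, and that is exactly the decision a simple-looking suggestion tends to hide.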
5. Textual Chat Flow Differs from Programming Flow
This predicament carries an inherent degree of subjectivity, as the intricacies of mental representations of the act of programming can widely diverge amongst individuals. For myself, my cogitations concerning the realization of a particular feature pertain to the structural constructs afforded by the programming language in question, the functions within my repertoire, and how modules may be interconnected to furnish the desired outcome. Consequently, the articulation of these cognitive frameworks must be transmuted into language amenable to transcription within a chat interface. This transmutation engenders a disruptive effect on the conventional flow inherent to my coding practice. The result is emblematic of a broader trend in which the temporal and intellectual investments requisite for the utilization of AI often exceed the inputs indispensable for self-authored code production.
Using AI chatbots — positive aspects
Notwithstanding the multiplicity of concerns explicated in connection with chatbots, certain favorable applications emerged during my engagement with these tools:
1. Facilitating Exploration in Unfamiliar Domains
Occasionally, programmers are compelled to navigate uncharted territories. The commonplace recourse, which frequently entails an excursion into the realm of Stack Overflow, becomes a challenging endeavor when the absence of familiarity with the subject precludes the formulation of a coherent query. In my own experience, I endeavored to configure a Github Action that would expedite the orchestration of my server, database, and integration testing procedures. My understanding of Github Actions remained rudimentary at best. Notably, the configuration bore a semblance to Docker Compose, though the precise orchestration of distinct services or containers remained obscure to me. It was then that I beseeched ChatGPT:
“Can you assist me in configuring Github Actions to enable the deployment of an application and its associated database as Github service runners, followed by the execution of a Postman collection to assess the application’s functionality?”
The response contained a fragment of non-functional configuration. However, it included fundamental components that constitute a Github Action configuration. This provided me with a general idea of how the desired setup should look and what issues I needed to address. Even if a chatbot cannot provide a 100% accurate answer, it can help us refine our questions or narrow down our search areas.
2. Employing chatbots as an interactive rubber duck
One of the fundamental lessons I imbibed during my formative years as a programmer was the wisdom of exploring a problem myself before seeking the counsel of a more seasoned developer. Deliberation is often aided by an imaginary listener, conventionally depicted as a rubber duck. Chatbots give us an additional advantage. Apart from the requirement to ponder how our question should be phrased before asking it, the advantage of AI over a rubber duck is that it can respond. Despite my earlier comments about AI often providing non-working solutions, when we do not need a specific solution but rather ideas on how to proceed, text generated by artificial intelligence can offer a fresh perspective on a problem that we did not have previously.
It can do even more: explain concepts from languages we don’t know, or popular libraries that we’d like to learn.
3. Streamlining the Configuration of Fundamental Structures
ChatGPT and Github Copilot are effective at providing the basic structure for some applications.
For instance, I approached Github Copilot for the development of a simplistic backend application employing Express.js. On another occasion, I sought the establishment of a Rust-based backend server employing the Axum framework and async-graphql. In both instances, chatbots demonstrated their ability to fulfill these requests. When we use a multitude of tools in our work, it can become easy to forget every detail required to start a new instance of our application. AI makes it much easier to begin from scratch without delving into documentation for simple tasks.
4. Operations on selected code
This is a Github Copilot Chat feature. It allows you to select a portion of the code and ask Copilot to perform various actions on it. You can request an explanation of what the code does or ask for tasks like translating a code fragment to another language, refactoring it, or even writing a unit test. For example, I asked it to rewrite a JavaScript switch statement as an if/else chain.
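To give a sense of what such a request involves, here is a minimal hand-written sketch in TypeScript. It is not the code from my session: the functions `describeStatusSwitch` and `describeStatusIfElse` and the status labels are hypothetical, chosen only to show the shape of the transformation.

```typescript
// Hypothetical selected code: a switch mapping an HTTP status code to a label.
function describeStatusSwitch(status: number): string {
  switch (status) {
    case 200:
      return "OK";
    case 404:
      return "Not Found";
    case 500:
      return "Internal Server Error";
    default:
      return "Unknown";
  }
}

// The kind of result one would hope for: the same logic as an if/else chain.
// A switch compares with strict equality, so === preserves the behavior.
function describeStatusIfElse(status: number): string {
  if (status === 200) {
    return "OK";
  } else if (status === 404) {
    return "Not Found";
  } else if (status === 500) {
    return "Internal Server Error";
  } else {
    return "Unknown";
  }
}
```

Because the selection is small and the expected behavior is unchanged, this type of request is easy to verify by eye or with a quick test, which is a large part of why it works well.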
Extensions for Integrated Development Environments (IDEs)
In contrast to chatbots, IDE extensions operate more akin to the Language Server Protocol (LSP). Upon code entry, AI within IDE extensions endeavors to provide suggestions on how best to complete a given code fragment.
AI IDE extensions can also generate code suggestions from code comments.
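As a rough illustration of comment-driven completion, consider the TypeScript sketch below. The comment plays the role of the prompt, and the function underneath is the kind of completion a tool might propose. Both are written by hand for this article and are not actual tool output; the name `sortWordsByLength` is an assumption.

```typescript
// Prompt comment a developer might type:
// Return the words of a sentence sorted by length, shortest first.

// A plausible completion (hypothetical; real suggestions vary between tools and sessions).
function sortWordsByLength(sentence: string): string[] {
  return sentence
    .split(/\s+/)                          // split on runs of whitespace
    .filter((word) => word.length > 0)     // drop empty strings from leading/trailing spaces
    .sort((a, b) => a.length - b.length);  // ascending by word length
}
```

The more precise the comment, the less the tool has to guess, but the suggestion still needs the same review as any other code.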
Another option is an interactive mode that lets you pick from several suggestions:
How to open interactive mode depends on the IDE you’re using.
LSP vs. AI
For some time now, many programming languages have had their own language servers implementing the Language Server Protocol (LSP). A language server offers suggestions based on existing code. For instance, if a module we intend to use contains a particular function, LSP will often suggest its use, including details about its parameters and return type. This type of assistance works particularly well with statically typed languages. A prime example of this is the Rust programming language, where LSP not only suggests methods and types but can even generate templates for trait implementation.
In the example provided, I’m using LSP’s suggestions to fill in the methods I should implement.
Given that LSP can provide so much guidance, what does AI bring to the table? While LSP helps us work more efficiently with the code we already have, AI tools attempt to guess our intentions and suggest code accordingly. In other words, the suggestion may encompass more than just using a single method; it could include the logic for an entire function or even a complete module.
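The difference can be sketched in TypeScript. The `Shape` interface and both classes below are hypothetical. The first class shows the sort of stub a language server's "implement interface" action typically produces: a signature that is guaranteed to match, with a placeholder body. The second shows what an AI tool attempts on top of that: a guess at the body itself, which may or may not be what we wanted.

```typescript
interface Shape {
  area(): number;
}

// Language-server style: the signature is derived from the interface,
// so it is always correct, but the logic is left to the developer.
class CircleStub implements Shape {
  readonly radius: number;

  constructor(radius: number) {
    this.radius = radius;
  }

  area(): number {
    throw new Error("Method not implemented.");
  }
}

// AI style: the tool guesses the intent from names and context and
// proposes an implementation. Here the guess happens to be right.
class Circle implements Shape {
  readonly radius: number;

  constructor(radius: number) {
    this.radius = radius;
  }

  area(): number {
    return Math.PI * this.radius ** 2;
  }
}
```

The stub can never be wrong about what exists in the code base, while the guessed body can be wrong in ways the compiler will not catch.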
Now that we understand how tools like Github Copilot, CodeWhisperer, or Tabnine operate, let’s explore the problems they may present and the benefits they can offer:
Cons
1. Naming Issues
Similar to chatbots and in contrast to LSP, AI-based tools do not verify the correctness of the methods we intend to use. Instead, they aim to suggest what seems most suitable based on extensive training data. Completing our code often results in references to code that doesn’t exist or exists elsewhere in our application under different names. Surprisingly, even after extended use, IDE plugins may still offer incorrect type or method names. Sometimes, especially with small functions, it is less frustrating to write the code from scratch than to correct these erroneous names.
For example, the suggested variable name in the given example is incorrect.
2. Nonfunctional Code and Inaccurate Suggestions
From my experience, suggestions from comments are often inaccurate.
In the example provided below, I stopped completing subsequent suggestions when Github Copilot started including dependencies related to the implementation of websockets for a simple “hello world” Express.js application:
In the example below, when I switched to interactive mode with Github Copilot, which allows us to choose from various suggestions, none of them worked.
3. Lack of Repetition — Erosion of Skills/Memory
An often less obvious issue when using artificial intelligence is that, by taking over repetitive code completion tasks, AI may hinder more than it helps. Repetition is an important element of learning and retaining knowledge. If we stop performing repetitions because AI handles them for us, it could become problematic when working on a project where AI-powered programming tools aren’t available. It might result in difficulties remembering methods from a particular library or how to set up a basic configuration for a system.
4. Visual Clutter and Reduced Focus
Suggestions are often displayed on the screen in some form and can actively change as we write code. Initially, resisting the temptation to analyze suggestions is challenging. When combined with regular LSP suggestions, it may appear as though the screen is in chaos, which can negatively affect focus.
Pros
1. Faster Boilerplate Code Completion
Just as with ChatGPT or Github Copilot Chat, IDE plugins like Copilot or CodeWhisperer are quite useful for filling in repetitive, “boring” code that can’t be or shouldn’t be generalized. However, it’s crucial to note that AI-assisted code completion for repetitive code is most effective when working with the same file or during the same coding session.
It may not look like much, but having AI fill in lines we would otherwise have to type repeatedly saves a few moments here and there and lets us focus on more important and challenging tasks.
2. Minor Assistance
AI learns alongside our coding, and its suggestions improve over time. While it’s not a revolutionary change that allows you to create entire modules or systems with just a few clicks, the assistance it provides is significantly more subtle. It often involves small tasks that might not be immediately noticeable but can make a difference over time.
After some time of using Copilot, it became difficult for me to distinguish whether I was using a Copilot suggestion or the Language Server Protocol (LSP) because I had developed a habit of relying on the tool.
For example, when I need to create a new function with known parameters and return types, it saves a lot of typing. Here, an AI suggestion matches what I need. In other instances, when I have an array and want to map it, I’ve done similar tasks in other places. With a few taps, AI provides a template that closely aligns with my requirements, and I only need to adjust one callback.
In my previous points, I mentioned various drawbacks and challenges, as there are many. Learning to ignore what doesn’t match can be a significant adjustment. AI tools can be incredibly useful, and there comes a point where everything clicks. You type, and AI seems to understand your intentions almost as if it’s reading your mind. However, there are also times when you ask a chatbot for something, and it provides faulty examples, leading to frustrating debates and time spent correcting the code without much progress.
Legal notice
When it comes to intellectual property and the questions raised by AI output and suggestions, I can identify three main, interconnected topics:
Who is the owner of the code created by AI?
The first question revolves around ownership of the generated code. While it may seem like a straightforward inquiry, the answer is far from simple. If the question is simply who owns the rights to the suggestion output, GitHub Copilot answers it in its product-specific terms, which you can find at this link: https://github.com/customer-terms/github-copilot-product-specific-terms.
Specifically, it states:
‘2. Ownership of Suggestions and Your Code. GitHub does not claim any ownership rights in Suggestions. You retain ownership of Your Code.’
But what about other tools like ChatGPT? You can explore their rules and agreements or attempt to seek a response by asking ChatGPT directly.
ChatGPT Owner (Organization/Platform): The organization or platform that operates ChatGPT typically retains ownership and control over the technology, including the outputs it generates. They may have terms of service that outline how the outputs can be used and any restrictions.
Person Providing Input: The individual providing input to ChatGPT usually has the right to use the output for their intended purposes, subject to any terms and conditions set by the owner of ChatGPT. However, there may be limitations on the use of the outputs, such as not using the technology for illegal or harmful activities.
In essence, ChatGPT’s owner retains ownership and allows users to use its output in certain circumstances. If you’re a developer considering using ChatGPT or another tool in your project, it’s advisable to consult with your legal team.
AI-generated code subject to licensing
The complexity of the issue with Github Copilot leads us to a fundamental problem: what if generative AI produces code that is the same or similar to code that is already copyrighted? This scenario can occur, as evidenced by this striking example:
According to Github Copilot’s page:
What if I’m accused of copyright infringement based on using a GitHub Copilot suggestion?
GitHub will defend you as provided in the GitHub Copilot Product Specific Terms.
Additionally, GitHub attempts to implement measures and filters to ensure that output code does not replicate existing copyrighted code. Among Github Copilot’s features, it even highlights one that checks whether the output copies existing code.
CodeWhisperer, on the other hand, states that developers own the code generated from suggestions but are also responsible for it. https://aws.amazon.com/codewhisperer/faqs/#:~:text=Who%20owns%20the%20code%20generated,CodeWhisperer%20suggestions%20that%20you%20accept.
It also mentions that if the code is similar to code from the training data, it may provide links to the references, and that it is possible to turn off suggestions that reference existing code.
Is this enough to use the tool responsibly in your project? I can’t provide a definitive answer, as it often requires approval from your legal team.
Uploading code - external server
In almost every case, receiving AI suggestions involves sending code to a third-party service (except for the option in Tabnine Pro, which requires explicit configuration).
GitHub Copilot provides a comprehensive guide on how your data sent to the cloud is secured. However, there are companies with policies that prohibit sharing code with external entities, regardless of the license and agreement in place. If you work in research and development (R&D) or security or on projects with strict copyright rules, you may not be allowed to use such tools.
I would always recommend seeking approval from a product owner when considering the use of AI tools. For some projects, it may be a straightforward approval process, while for others, it may require an in-depth legal investigation. Sometimes, supervisors may be hesitant to allocate resources for such assessments, understanding the potential complexity of the legal process.
In summary, I have to note that artificial intelligence tools are entirely new territory for legal systems. The rules and regulations in this area are still taking shape. Governments may attempt to address the changes introduced by AI tools, which bring new challenges. Comfortable usage may necessitate frequent updates to the rules. On the other hand, companies behind generative AI models also seem to recognize these issues and are working to mitigate them in some way.
AI or Not: The Complex Reality
In the text, I’ve highlighted a plethora of issues that arise when utilizing artificial intelligence for programming. This stems from my initial experiences, which were profoundly influenced by marketing materials touting a game-changing revolution.
The use of AI-based tools, however, doesn’t represent a revolutionary shift in the way developers craft code; it’s more of an evolution, offering incremental improvements over traditional Language Server Protocol (LSP). These tools indeed take us a step further, making coding a bit faster and more enjoyable. Nevertheless, unlocking their potential necessitates an investment of time to grasp their nuances, ignore superfluous suggestions, and discern when adopting their guidance will be beneficial or lead to a time-consuming debugging process.
When it comes to the decision of whether to embrace AI-based programming tools, the answer isn’t clear-cut. They can be a boon for some, while for others, they might serve as distractions or obstacles to learning. The most effective way to determine their utility is to experiment with them personally.
AI Replacing Programmers?
I think the answer to this question is probably known to people who have used tools like ChatGPT or Github Copilot.
Drawing from my experience of over eight years as a programmer, I’ve come to appreciate that programming is an ongoing journey of acquiring new knowledge and leveraging emerging libraries.
However, given the magnitude and diverse nature of errors that AI can introduce, along with the ever-evolving landscape of programming languages and libraries, AI may never fully replace human programmers. The distinction lies in the capability of human comprehension, even with imperfect sentences, whereas computer programs cannot afford any errors. AI is grounded in existing knowledge, whereas we, as commercial developers, frequently embark on creating something novel.
In my opinion, developers might sleep well… at least for now.
Tools mentioned:
- Github Copilot with Github Copilot Chat: GitHub Copilot · Your AI pair programmer
- CodeWhisperer: AI Code Generator — Amazon CodeWhisperer — AWS
- Tabnine: Tabnine is an AI assistant that speeds up delivery and keeps your code safe
- ChatGPT: https://chat.openai.com/
Worth noting:
- A tool dedicated to PR reviews: https://github.com/Codium-ai/pr-agent
- Very interesting tool to produce software: https://github.com/OpenBMB/ChatDev
Many more are available on the internet.
Extra Note
As an additional note, I want to mention that I used ChatGPT to assist me in writing this article. It’s important to clarify that I didn’t ask the AI to generate the text for me. I wrote the text myself, but I employed ChatGPT to aid in editing and improving the text. Just like with code, the results weren’t flawless, but the assistance I received was tremendously valuable, showcasing another great way to leverage this technology.