Consider a world where the software that powers your favorite apps, protects your online transactions, and keeps your digital life running could be hijacked by a cleverly disguised piece of code. This isn’t the plot of the latest cyber-thriller; it has actually been a reality for years. How that will change, for better or for worse, as artificial intelligence (AI) plays a greater role in software development is one of the great uncertainties of this brave new world.
In an era when artificial intelligence promises to revolutionize the way we live and work, the discussion about its security implications cannot be sidelined. As we increasingly rely on AI for tasks ranging from the mundane to the mission-critical, the question is no longer just “Can AI strengthen cybersecurity?” We must also ask “Can AI be hacked?” (Yes!), “Can AI be used to hack?” (Of course!), and “Will AI produce secure software?” Taking these questions seriously, and making secure coding a standard practice, is a necessity if we want to protect our digital future.
You can test your secure coding skills with this short self-assessment quiz.
The security paradox of artificial intelligence
Artificial intelligence’s leap from academic curiosity to the cornerstone of modern innovation happened quite suddenly. Its applications span an astonishing range of areas, providing solutions that once existed only in science fiction novels. However, this rapid advancement and adoption has outpaced the development of corresponding security measures, leaving both AI systems and AI-created systems vulnerable to a variety of sophisticated attacks. Deja vu? The same thing happened when software itself took over many areas of our lives…
At the heart of many artificial intelligence systems is machine learning, a technique that relies on extensive data sets to “learn” and make decisions. Ironically, AI’s strength, its ability to process and generalize from large amounts of data, is also its Achilles’ heel. “Anything we find on the Internet” may not be perfect training material; unfortunately, the wisdom of the crowd may not be enough in this case. Additionally, hackers with the right tools and knowledge can manipulate this material to trick AI into making poor decisions or taking malicious actions.
Copilot in the crosshairs
Powered by OpenAI’s Codex, GitHub Copilot demonstrates the potential of artificial intelligence in coding. It’s designed to increase productivity by suggesting snippets of code or even entire blocks of code. However, multiple studies have highlighted the dangers of relying solely on this technology. It turns out that a large portion of the code generated by Copilot may contain security flaws, including vulnerabilities for common attacks such as SQL injection and buffer overflows.
The “garbage in, garbage out” (GIGO) principle is particularly relevant here. AI models, including Copilot, are trained on existing data, and as with other large language models, much of that training is unsupervised. If the training data is flawed (which is likely, given that it comes from open source projects and large Q&A sites like Stack Overflow), the output, including code suggestions, may inherit and propagate those flaws. In the early days of Copilot, a study showed that when Copilot was asked to complete code based on samples of the CWE Top 25, approximately 40% of the generated code samples were vulnerable to attack, underscoring the GIGO principle and the need for increased security awareness. A larger study in 2023 (Is GitHub’s Copilot as bad as humans at introducing vulnerabilities in code?) had somewhat better results, but still far from ideal: when the vulnerable line of code was removed from a real-world vulnerability example and Copilot was asked to complete it, it recreated the vulnerability about one third of the time and fixed it only about one quarter of the time. In addition, it performed very poorly on vulnerabilities related to missing input validation, producing vulnerable code every time. This highlights that generative AI struggles to handle malicious input wherever there is no “silver bullet” solution for the vulnerability class, such as prepared statements against SQL injection.
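To make the prepared-statement point concrete, here is a minimal sketch in Python (using the built-in sqlite3 module and a hypothetical users table; this is an illustration, not actual Copilot output). It contrasts the kind of string-built query an assistant might suggest with the parameterized version that neutralizes the attack:

```python
import sqlite3

# Set up a throwaway in-memory database with a hypothetical users table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 's3cret')")

user_input = "' OR '1'='1"  # classic SQL injection payload

# VULNERABLE: user input is concatenated straight into the SQL statement,
# so the payload rewrites the query's logic and matches every row.
rows = conn.execute(
    "SELECT name FROM users WHERE password = '" + user_input + "'"
).fetchall()
print("concatenated query matched:", rows)   # -> [('alice',)]

# SAFE: a prepared statement with a placeholder treats the input purely as
# data, never as SQL, so the payload matches nothing.
rows = conn.execute(
    "SELECT name FROM users WHERE password = ?", (user_input,)
).fetchall()
print("parameterized query matched:", rows)  # -> []
```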
The road to secure artificial intelligence-driven software development
Solving the security challenges posed by artificial intelligence and tools like Copilot requires a multifaceted approach:
- Understand vulnerabilities: It is important to recognize that AI-generated code may be vulnerable to the same types of attacks as “traditionally” developed software.
- Improve secure coding practices: Developers must be trained in secure coding practices while taking into account the nuances of AI-generated code. This involves not only identifying potential vulnerabilities, but also understanding the mechanisms by which artificial intelligence suggests certain snippets of code to effectively predict and mitigate risks.
- Adapt the SDLC: It’s not just about the technology; processes should also account for the subtle changes AI brings. And it cuts both ways: requirements, design, maintenance, testing, and operations can all benefit from large language models, too.
- Continuous vigilance and improvement: Artificial intelligence systems—like the tools they power—are constantly evolving. Keeping pace with this development means staying up to date on the latest security research, understanding emerging vulnerabilities and updating existing security practices accordingly.
Integrating AI tools like GitHub Copilot into the software development process is risky and requires not only a change in mindset but also robust strategies and technical solutions to mitigate potential vulnerabilities. Here are some practical tips designed to help developers ensure that using Copilot and similar AI-driven tools can increase productivity without compromising security.
Implement strict input validation!
Practical implementation: Defensive programming has always been at the core of secure coding. When accepting Copilot’s code suggestions, implement strict input validation measures, especially for functions that handle user input. Define rules for user input, create allowlists of permitted characters and data formats, and ensure input is validated before processing. You can also ask Copilot to do this for you; sometimes it actually works quite well!
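As a minimal sketch of what allowlist validation can look like in practice (the username field, pattern, and length limits here are invented for illustration):

```python
import re

# Allowlist-based validation: define exactly what is permitted and reject
# everything else, instead of trying to blocklist "dangerous" characters.
USERNAME_RE = re.compile(r"^[A-Za-z0-9_]{3,32}$")

def validate_username(value: str) -> str:
    """Return the value if it matches the allowlist, otherwise raise."""
    if not USERNAME_RE.fullmatch(value):
        raise ValueError("invalid username")
    return value

validate_username("copilot_fan42")          # passes
# validate_username("robert'); DROP--")     # raises ValueError
```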
Manage dependencies securely!
Practical implementation: Copilot may suggest adding dependencies to your project, and attackers can exploit this to mount supply chain attacks via “package hallucination”. Before incorporating any suggested library, manually verify its security status by checking for known vulnerabilities in a database such as the National Vulnerability Database (NVD), or perform software composition analysis (SCA) with tools such as OWASP Dependency-Check, or npm audit for Node.js projects. These tools can automatically track and manage the security of your dependencies.
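As one possible way to turn this into an automated gate, here is a minimal CI sketch assuming a Python project and the pip-audit tool (which exits with a nonzero status when it finds known-vulnerable dependencies); npm audit or OWASP Dependency-Check can play the same role in other ecosystems:

```python
import subprocess
import sys

# Run pip-audit against the current environment; it consults vulnerability
# databases and reports any installed packages with known CVEs.
result = subprocess.run(["pip-audit"], capture_output=True, text=True)
print(result.stdout)

# Treat any finding as a build failure so vulnerable dependencies cannot
# slip into a release unnoticed.
if result.returncode != 0:
    print("Vulnerable dependencies found; failing the build.", file=sys.stderr)
    sys.exit(1)
```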
Conduct regular security assessments!
Practical implementation: Regardless of whether your code is AI-generated or hand-written, conduct regular code reviews and testing with a focus on security. Combine approaches: static application security testing (SAST), dynamic application security testing (DAST), and software composition analysis (SCA). Perform manual testing and supplement it with automation. But remember to put people over tools: no tool or artificial intelligence can replace natural (human) intelligence.
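To illustrate the kind of issue SAST catches, here is a hedged, Unix-flavored before-and-after sketch; static analyzers such as Bandit flag the shell=True call with user-controlled input as a command injection risk:

```python
import subprocess

# Imagine this value arrived from a request parameter or form field.
filename = "report.txt; echo INJECTED"

# FLAGGED by SAST tools such as Bandit: shell=True plus user-controlled
# input allows command injection; the payload above makes the shell run a
# second command and print INJECTED.
subprocess.run("ls -l " + filename, shell=True)

# Safer variant: pass an argument list and avoid the shell entirely, so the
# whole value is treated as a single (non-existent) filename instead.
subprocess.run(["ls", "-l", filename])
```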
Step by step!
Practical implementation: First, ask Copilot to write your comments or debug log statements; it is quite good at these, and any mistakes in them won’t affect the security of your code. Then, once you’re familiar with how it works, you can gradually let it generate more and more snippets of code for actual functionality.
Always review what Copilot offers!
Practical implementation: Never blindly accept Copilot’s suggestions. Remember, you are the pilot and it is “just” the copilot! You and Copilot can be a very effective team, but the responsibility is still yours, so you must know what the expected code is and what the result should look like.
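As a hypothetical illustration of why that matters, consider a plausible-looking file-reading helper hiding a path traversal flaw, next to the version a reviewing pilot should insist on (the directory and function names are invented for this example):

```python
from pathlib import Path

# Hypothetical upload directory; names below are invented for the example.
BASE_DIR = Path("/var/app/uploads")

# Plausible-looking suggestion: it joins the user-supplied name onto a base
# directory, but a name like "../../secret.key" (or an absolute path) still
# escapes BASE_DIR.
def read_upload_naive(name: str) -> bytes:
    return (BASE_DIR / name).read_bytes()

# What a careful review should insist on: resolve the final path and verify
# it is still inside BASE_DIR before reading anything (requires Python 3.9+
# for Path.is_relative_to).
def read_upload_checked(name: str) -> bytes:
    target = (BASE_DIR / name).resolve()
    if not target.is_relative_to(BASE_DIR.resolve()):
        raise ValueError("path escapes the upload directory")
    return target.read_bytes()
```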
Experiment!
Practical implementation: Try different prompts and approaches (in chat mode). If you’re not satisfied with the results, ask Copilot to improve the code. Try to understand how Copilot “thinks” in certain situations and recognize its strengths and weaknesses. Besides, Copilot gets better over time, so keep experimenting!
Stay informed and educated!
Practical implementation: Continuously keep yourself and your team informed about the latest security threats and best practices. Follow security blogs, attend webinars and workshops, and participate in forums dedicated to secure coding. Knowledge is a powerful tool for identifying and mitigating potential vulnerabilities in your code, whether AI-generated or not.
In conclusion
As we navigate the uncharted waters of AI-generated code, secure coding practices have never been more important. Tools like GitHub Copilot offer tremendous opportunities for growth and improvement, but they also present unique challenges when it comes to code security. Only by understanding these risks can we successfully reconcile effectiveness with security and protect our infrastructure and data. Along the way, Cydrill remains committed to equipping developers with the knowledge and tools they need to build a more secure digital future.
Cydrill’s blended learning journey provides proactive and effective secure coding training to developers from Fortune 500 companies around the world. By combining instructor-led training, e-learning, hands-on labs, and gamification, Cydrill offers a novel and effective way to learn how to code securely.
Check out Cydrill’s secure coding course.