Disclaimer: I worked on this project independently during personal time. Nothing here represents the views or endorsement of SGS. Any opinions, findings, and conclusions expressed in this blog post are solely mine. The project utilizes Python, OpenAI’s language models and STARLIMS mock code I created, which may have separate terms of use and licensing agreements.
AI is not a magician; it's a tool. It is a tool for amplifying innovation.
Fei-Fei Li
With this in mind, imagine: what if we could automatically get STARLIMS Code Review feedback? You know, an extra quality layer powered by AI?
“STARLIMS is a proprietary language” you will say.
“It has to be trained” you will say.
True; yet, what if?…
I have done another experiment. I was given a challenge to try Python and the OpenAI API, but I wasn't really given any background. Given my recent fun with CI/CD and the fact that I'm back working on the STARLIMS product, I thought: "Can we automatically analyze STARLIMS code? Like an automated code reviewer?" Well, yes, we can!
As someone recommended to me a long time ago, let me show you the end result of this project first. I have a REST API running locally (Python Flask) with the following two endpoints:
POST /analyze/<language>
POST /analyze/<language>/<session_id>
The first kicks off a new analysis session, and the second lets the user continue an existing session (like queuing scripts and relating them together!).
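To give you an idea of how little plumbing that takes, here is a minimal Flask sketch of those two endpoints. The analyze_code() helper and the in-memory session store are placeholders I made up for illustration; they are not the actual project code.

import uuid

from flask import Flask, jsonify, request

app = Flask(__name__)
sessions = {}  # session_id -> previously analyzed scripts (kept in memory for this sketch)

def analyze_code(language, code, history):
    # Placeholder: the real thing calls the OpenAI API (see further down).
    return {"feedback": []}

@app.post("/analyze/<language>/<session_id>")
def analyze_existing(language, session_id):
    code = request.get_data(as_text=True)
    history = sessions.setdefault(session_id, [])
    analysis = analyze_code(language, code, history)
    history.append(code)
    return jsonify({"analysis": analysis, "session_id": session_id})

@app.post("/analyze/<language>")
def analyze_new(language):
    # A new session is just a fresh UUID handed to the same handler.
    return analyze_existing(language, str(uuid.uuid4()))

if __name__ == "__main__":
    app.run(port=5000)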
I usually create nice diagrams, but for this one, really, the idea is:
STARLIMS <-> Python <-> OpenAI
So no diagram for you today! How does it work?
I can pass SSL code to a REST API, and receive this:
{
  "analysis": {
    "feedback": [
      {
        "explanation": "Defaulting a parameter with a numeric value may lead to potential issues if the parameter is expected to be a string. It's safer to default to 'NIL' or an empty string when dealing with non-numeric parameters.",
        "snippet": ":DEFAULT nItemId, 1234;",
        "start_line": 4,
        "suggestion": "Consider defaulting 'nItemId' to 'NIL' or an empty string depending on the expected data type.",
        "type": "Optimization"
      }
    ]
  },
  "session_id": "aa4b3bd3-75bd-42e3-8f31-e53502e68256"
}
It works with STARLIMS Scripting Language (SSL), STARLIMS Data Sources (DS) and… JScript! Here's an example of JScript output:
{
  "analysis": {
    "items": [
      {
        "detailed explanation": "Checking for an empty string using the comparison operator '==' is correct, but using 'Trim()' method before checking can eliminate leading and trailing white spaces.",
        "feedback type": "Optimization",
        "snippet of code": "if (strMaterialType == '')",
        "start line number": 47,
        "suggestion": "Update the condition to check for an empty string after trimming: if (strMaterialType.trim() === '')"
      },
      {
        "detailed explanation": "Using the logical NOT operator '!' to check if 'addmattypelok' is false is correct. However, for better readability and to avoid potential issues, it is recommended to explicitly compare with 'false'.",
        "feedback type": "Optimization",
        "snippet of code": "if (!addmattypelok)",
        "start line number": 51,
        "suggestion": "Update the condition to compare with 'false': if (addmattypelok === false)"
      },
      {
        "detailed explanation": "Checking the focused element is a good practice. However, using 'Focused' property directly can lead to potential issues if the property is not correctly handled in certain scenarios.",
        "feedback type": "Optimization",
        "snippet of code": "if ( btnCancel.Focused )",
        "start line number": 58,
        "suggestion": "Add a check to ensure 'btnCancel' is not null before checking its 'Focused' property."
      }
    ]
  },
  "session_id": "7e111d84-d6f4-4ab0-8dd6-f96022c76cff"
}
How cool is that? To achieve this, I used Python and the OpenAI API. I had to purchase some credits, but really, it is cheap enough and worth it when used at a small scale (like a development team). I put $10 in there, I have been running many tests (maybe a few hundred), and I am down by $0.06, so… I would say it's worth it.
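For reference, the call to OpenAI behind analyze_code() boils down to something like the sketch below, using the official openai Python package. The model name, prompt wording, and JSON handling are assumptions I'm making for illustration, not the exact ones from my project.

import json

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def analyze_code(language, code, history):
    system_prompt = (
        f"You are a senior {language} code reviewer. "
        "Review the user's script and answer with JSON only: a list of items, "
        "each with type, snippet, start_line, explanation and suggestion."
    )
    # Scripts already analyzed in the same session give the model extra context.
    context = [{"role": "user", "content": earlier} for earlier in history]
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",  # assumption; any chat model works, this one is cheap
        temperature=0,          # keep the reviews as repeatable as possible
        messages=[{"role": "system", "content": system_prompt}]
                 + context
                 + [{"role": "user", "content": code}],
    )
    # The prompt asks for JSON only; a real implementation should handle parse errors.
    return json.loads(response.choices[0].message.content)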
The beauty of this is that my project supports the following:
Add a new language in 5 minutes (just add the class, update the prompt, add the reference code, restart the app, go! See the sketch just after this list.)
Enhance accuracy by providing good code, training the assistant on what valid code looks like
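To make the first point concrete, here is roughly the shape a language definition could take. The class and field names are illustrative, not the real ones from my repo.

from dataclasses import dataclass
from pathlib import Path

@dataclass
class LanguageProfile:
    name: str             # e.g. "ssl", "ds", "jscript"
    prompt: str           # review instructions sent as the system prompt
    reference_file: Path  # known-good sample code used to "teach" the assistant

    def reference_code(self) -> str:
        return self.reference_file.read_text(encoding="utf-8")

# Registering a new language is then one entry plus a reference code file.
LANGUAGES = {
    "ssl": LanguageProfile(
        name="ssl",
        prompt="You review STARLIMS Scripting Language (SSL) code...",
        reference_file=Path("reference/ssl_good_example.ssl"),
    ),
}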
To give you an idea, the project itself is very small.
Looking ahead with this small project, I’m thinking beyond just checking code for errors. Imagine if we could hook it up to our DevOps setup, like Azure DevOps or SonarQube. It would be like having a digital assistant that not only spots issues but also files bugs and suggests improvements automatically! This means smoother teamwork, better software quality, and fewer headaches for all of us.
Now that I got this working, I am thinking about a bunch of exciting ideas, like:
Integrate this as a quality gate on commits:
If it fails, the code goes back to the developer
If it succeeds, record the results and process the pull request (or push it to the next human reviewer)
Implement a mechanism for automatic unit test generation (we can potentially do something there!)
Implement a mechanism for code coverage reports (also possible!)
Integrate these into STARLIMS directly so we can benefit from them and include them in a CI/CD pipeline somehow
Dreaming is free, is it not? Well, not quite in this case, but I'm good for another $9.94…
I have the repo set as private on GitHub. This is a POC, but I think it could be a very cool thing for STARLIMS, and it would also work for any other proprietary language if I can get some good sample code.
Hell, it can even work for already-supported languages like JavaScript, C#, or anything else, without training! So we could use this pattern for virtually any code review.
Alright, so if you followed along with the previous post, you know I have set up Jenkins to kind of run continuous integration. Well, I have now pushed it a bit further.
I installed a Docker image of SonarQube (the Community Edition), and wow, I have only one regret: I should have started with all of this setup on day one.
My flow is now this:
So, in a nutshell, what is VERY COOL is that when I push code to my develop branch, this happens automatically:
unit tests are triggered
code analysis is triggered
And in the SonarQube code analysis, I found a bunch of interesting suggestions, enhancements, and bug fixes. They were not necessarily product-breaking, but I found many things I was not even aware of. My future code will just be better.
CD pipeline?
I also added a CD pipeline for my test environment. I am not ready yet to put quality gates in place to automate production deployment, but I am on the right track! Here is my current CD pipeline:
It is quite simple, but it works just perfectly!
Now, I was wondering if this would be too much for my server. You know, running all of these:
Verdaccio Docker image (private npm registry)
Jenkins Docker image (CI/CD pipelines)
SonarQube Docker image (code analysis)
3 test Docker images (React frontend, Node backend, service manager)
3 production Docker images (same as just above)
Nginx Docker image (reverse proxy)
Prometheus & Grafana (installed directly, not as Docker images) for system monitoring
Here’s what happens:
More or less: NOTHING.
Well, not enough to be concerned about yet. Of course, there are not a lot of users, but I expect that even with a few dozen users, it wouldn't be so bad. And if this became really serious, the production environment would be hosted in the cloud somewhere for 100% uptime (at least as a target).
To be honest, the tough part was getting the correct Jenkinsfile structure – just because I am not used to it. For safekeeping, I am writing my two pipelines here, and who knows, maybe they can help you too!
This whole article started as an attempt at sharing the steps to get a free "no-cloud" platform for continuous integration and continuous deployment. What triggered it? The time I spent doing npm install, npm build, npm publish, left and right, forgetting one, npm test, oops, I forgot one npm test and did a docker build… Damn, were those ever time-consuming and lousy activities.
I want to code. I don't want to do this. Let's separate the concerns: I code, the server does the rest. Sounds good? We call this separation of concerns, at a whole new level!
What is separation of concerns?
Short answer: Do what you are supposed to do and do it right.
Not so short answer: It is the principle of breaking down a complex system into distinct, independent parts or modules, each addressing a specific responsibility or concern, to promote modularity, maintainability, and scalability. It is usually applied at many (if not all) levels, like architecture, component, class, method, data, presentation, testing, infrastructure, deployment, etc.
… and why should you care? (Why it matters)
My article in itself? It doesn't really matter, and you shouldn't care. Unless, that is, you find yourself in a situation where you hear about continuous integration and deployment but don't really know where to start. Or if you have your own software project and want to take it to the next level. Or just because you like reading my stuff, who knows!
I recently started to flip this blog into a melting pot of everything I face on this personal project. Eventually, I may even release something! And then we will have a nice history of how it got there. For posterity!
Anyway, I am digressing from the original intent… I want to share the journey I went through to get a working environment with Jenkins and Verdaccio. I think it is a great platform for startups that can't or won't afford cloud hosting just yet (or avoid it for privacy reasons) but still want to achieve some level of automation.
As a reference, I’m sharing the challenges I am facing with a personal project consisting of a Node backend, a React frontend, and a background server, and how I tackle these challenges using modern tools and techniques.
I want to try something backward. Conclusion first! I completed the setup, and it works. It was painful, long, and not fun at some points.
But look at the above screenshot! Every time I push to one of my GitHub repos, it triggers a build and tests. In one case, it even publishes to my private package management registry with the :develop tag! Isn't it magical?
If you are interested in how I got there, tag along. Otherwise, have a great day! (still, you should meet my new friend, Jenkins)
Before we begin, here are some definitions (if you know what these are, just ignore).
You never know who will read your articles (if anyone). Play dumb.
Definitions
Continuous Integration (CI) and Continuous Deployment (CD): CI/CD are practices that automate the process of building, testing, and deploying software applications. CI ensures that code changes from multiple developers are regularly merged and tested, while CD automates the deployment of these tested changes to production or staging environments.
Node.js: Node.js is a runtime environment that allows developers to run JavaScript code outside of a web browser. It’s commonly used for building server-side applications, APIs, and real-time applications.
Docker: Docker is a platform that simplifies the process of creating, deploying, and running applications using containers. Containers are lightweight, standalone executable packages that include everything needed to run an application, including the code, runtime, system tools, and libraries.
Containers: Containers are isolated environments that package an application and its dependencies together. They provide a consistent and reproducible runtime environment, ensuring that the application runs the same way regardless of the underlying infrastructure. Containers are lightweight and portable, making them easier to deploy and manage than traditional virtual machines.
Let’s Begin!
Project Structure
Project 1: React Frontend
Project 2: Common backend (most models, some repos and some services)
Project 3: Node Backend (API)
Project 4: Node Worker (processing and collection monitoring)
Environments
I run development on whatever machine I am using at the moment, with nodemon and Vite in development mode
I build & run Docker containers on my Linux server with Docker Compose (3 Dockerfiles and 1 docker-compose file)
I have an nginx reverse proxy on the same server for SSL and dynamic IP (No-IP)
Objective
Achieve full CI/CD so I can onboard remote team members (and do what I like: code!)
IF I am successful, this will give me a robust and scalable development workflow that streamlines the entire software development life cycle, from coding to deployment, for my project. I think that in the enterprise, with similar tools, this proactive approach would lay a good foundation for efficient collaboration, rapid iteration, and reliable software delivery, ultimately reducing time-to-market and increasing overall productivity.
A couple of additional challenges
Challenge #1: Private Package Management. The npm registry is public. I want to keep my projects private; how can I have a package for the common backend components?
Challenge #2: Continuous Integration (CI). How can I implement a continuous integration pipeline with this? Code is on GitHub, the registry will be private… How do I do that?
Challenge #3: Continuous Deployment (CD). How can I implement a continuous deployment process? I will need to automate some testing and quality gates in the process, so how will that work?
Challenge #4: Future Migration to Continuous Delivery (CD+). Can I migrate to continuous delivery in the future (you know, if I ever have customers)?
Challenge #5: Cloud Migration / Readiness. When/if this becomes serious, can my solution be migrated to a cloud provider to reduce hardware failure risk?
With this in mind, I have decided to embark on a journey to set up a stable environment to achieve this and face each challenge. Will you take the blue pill and go back to your routine, or the red pill and follow me down the rabbit hole?
Starting point: bare metal (target: Ubuntu Server, Jenkins, Verdaccio, Docker)
I have this Trigkey S5 mini PC, which I think is pretty awesome for the price. It comes with Windows 11, but from what I read everywhere, to host what I want, I should go with a Linux distro. So there I go: I install some Linux distro from a USB key and cross my fingers that it boots…
I went with Ubuntu Server (24.04 LTS). BTW, the mini PC is a Ryzen 5800H with 32GB of RAM, which should be sufficient for a while. On there, I installed these – pretty straightforward, and you can find many tutorials online, so I won't go into details:
Docker (engine & compose)
Git
Cockpit (this makes it easier for me to manage my server)
I also have an NGINX reverse proxy container. You can google something like nginx reverse proxy ssl letsencrypt docker and you'll find great tutorials on setting this up as well. I may write another article later when I reach that point for some items (if required in this journey). But really, that's gravy at this stage.
Install and Configure Jenkins
Jenkins is an open-source automation server that provides CI/CD pipeline solutions. From what I could gather, we can use the Jenkins Docker image for easy management and portability, and it strikes a good balance between flexibility and simplicity. Disclaimer: this is my first experiment with Jenkins, so I have no clue how it will come out…
1. Let’s first prepare some place to store Jenkins’ data:
sudo mkdir /opt/jenkins
2. Download the Docker image:
sudo docker pull jenkins/jenkins:lts
3. Create a docker-compose.yml file to run Jenkins:
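A minimal sketch of that file, assuming the official jenkins/jenkins:lts image with ./jenkins mounted as the Jenkins home (adjust ports, volumes, and restart policy to your own setup):

# docker-compose.yml - minimal sketch, not necessarily my exact file
services:
  jenkins:
    image: jenkins/jenkins:lts
    container_name: jenkins
    restart: unless-stopped
    ports:
      - "8080:8080"    # web UI
      - "50000:50000"  # inbound agents
    volumes:
      - ./jenkins:/var/jenkins_home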
5. Since I mounted ./jenkins as my Jenkins home, I can just run cat jenkins/secrets/initialAdminPassword to get the initial admin password and continue. (For some reason, I had to paste it and click Continue twice, then it worked.)
I went with the recommended plugins to begin with. According to the documentation, we can easily add more later.
Install and Configure Verdaccio
Verdaccio will be my private package management registry. To install it, I just created a Docker Compose file, set up some directories, and boom.
Problem 1 – GitHub SSH keys and known hosts
Well, I was not prepared for this one. I spent a lot of time on it: since Jenkins runs in a container, it does not share everything with the host, and somehow, adding the private keys was not adding GitHub to the known hosts. So I had to run these commands:
me@server:~/jenkins$ sudo docker exec -it jenkins bash
jenkins@257100f0320f:/$ git ls-remote -h git@github.com:michelroberge/myproject.git HEAD
The authenticity of host 'github.com (140.82.113.3)' can't be established.
ED25519 key fingerprint is SHA256:+...
This key is not known by any other names.
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
Warning: Permanently added 'github.com' (ED25519) to the list of known hosts.
After that, this worked. It is only later that I found out I need to pass --legacy-auth to npm login when in headless mode. Moving forward, that won't be a problem anymore.
Problem 2 – npm not found
Who would have thought! Node is not installed by default; you need to add the NodeJS plugin. Once added, a typical pipeline needs to include it, something like:
pipeline {
    agent any
    tools { nodejs "node" }
    stages {
        stage('Install') {
            steps {
                sh 'npm install'
            }
        }
        stage('Build Library') {
            steps {
                sh 'npm run build:lib'
            }
        }
        stage('Build Application') {
            steps {
                sh 'npm run build:app'
            }
        }
    }
}
The tools section is what matters. I have named my NodeJS installation node, hence the "node" in there. Now that I have played with it, I understand: this allows me to have different Node versions and use the one I want in whichever pipeline I want. Neat!
And finally, I got something happening:
First step achieved! Now I can add npm run test to have my unit tests running automatically.
This is nice, but it is not triggered automatically. Since I use GitHub, I can leverage webhooks through the GitHub hook trigger:
Then all I need is to add a webhook in GitHub that points to https://<my-public-address-for-jenkins>:<port>/github-hook/ and that's it!
The result
With this, I can now build a fully automated CI pipeline. Now, what does fully automated mean? That's where the heavy lifting begins. I will be exploring and reading about it more in the coming weeks, but ultimately, I want to automate this:
Develop branch CI – when I push anything to the develop branch, run the following tasks:
pull from develop
install (install dependencies)
build (build the repo)
test (run the unit tests, API tests, e2e tests, etc. depending on the repo)
Staging branch CD – when I do a pull request from develop into the staging branch, run the following tasks:
pull from staging
install
build
test (yes, same again)
host in testing environment (docker)
load tests (a new suite of tests to verify response under heavy load)
I will then do “health checks”, analyze, and decide if I can/should do a pull from staging into main.
Main branch CD – when I do a pull request from staging into main, run the following tasks:
pull from main
install
build
test (of course!)
host in staging environment (docker)
do some checks, and then swap spots with the current production container
take down the swapped-out container
The reason I keep some manual tasks (step 3) is that I want to handle build candidates in kind of "the old way". When I introduce some additional test automation suites, I will probably enhance the whole thing.
By implementing these automated CI/CD workflows, I hope to achieve the following benefits:
Faster Feedback Cycles: Automated testing and deployment processes provide rapid feedback on code changes, allowing developers to quickly identify and resolve issues. I hope I won’t be the only developer forever on this project!
Early Detection of Issues: Continuous integration and testing catch defects early in the development cycle, preventing them from propagating to later stages and reducing the cost of fixing them.
Efficient and Reliable Deployments: Automated deployment processes ensure consistent and repeatable deployments, reducing the risk of human errors and minimizing downtime.
Improved Collaboration: Automated workflows facilitate collaboration among team members by providing a standardized and streamlined development process.
This is also something that will help me in my professional life – I kind of knew about it, but I always relied on others to do it. So now, I will at least better understand what is happening and the impact behind it. I love learning!
And guess what: this approach aligns with industry best practices for modern software development and delivery, including:
Separation of Concerns: Separating the frontend, backend, and worker components into different projects promotes maintainability and scalability.
Continuous Integration: Regular integration of code changes into a shared repository, along with automated builds and tests, ensures early detection of issues and facilitates collaboration.
Continuous Deployment: Automated deployment processes enable frequent and reliable releases, reducing the risk of manual errors and accelerating time-to-market.
Test Automation: Comprehensive testing strategies, including unit tests, API tests, end-to-end tests, and load tests, ensure high-quality software and catch issues early in the development cycle.
Containerization: Using Docker containers for deployment ensures consistent and reproducible environments across development, testing, and production stages.
To me, this experiment demonstrates the importance of proactively addressing challenges related to project organization, package management, and automation in software development – sooner rather than later. With tools like Jenkins, Verdaccio, and Docker, I have laid the groundwork for a robust and scalable CI/CD pipeline that facilitates efficient collaboration, rapid iteration, and reliable software delivery.
As my project evolves, I plan to further enhance the automation processes, ensuring a smooth transition to continuous delivery and potential migration to cloud providers.
Let me try to explain to you, what to my taste is characteristic for all intelligent thinking. It is, that one is willing to study in depth an aspect of one's subject matter in isolation for the sake of its own consistency, all the time knowing that one is occupying oneself only with one of the aspects. We know that a program must be correct and we can study it from that viewpoint only; we also know that it should be efficient and we can study its efficiency on another day, so to speak. In another mood we may ask ourselves whether, and if so: why, the program is desirable. But nothing is gained—on the contrary!—by tackling these various aspects simultaneously. It is what I sometimes have called "the separation of concerns", which, even if not perfectly possible, is yet the only available technique for effective ordering of one's thoughts, that I know of. This is what I mean by "focusing one's attention upon some aspect": it does not mean ignoring the other aspects, it is just doing justice to the fact that from this aspect's point of view, the other is irrelevant. It is being one- and multiple-track minded simultaneously.
Edsger W. Dijkstra, "On the role of scientific thought" (EWD447, 1974)
In the world of software development, my quest to understand continuous deployment led me down an intriguing path. It all began with a burning desire to unravel the complexities of continuous deployment while steering clear of expensive cloud hosting services. And that’s when my DIY GitHub Webhook Server project came to life.
The Genesis
Imagine being in my shoes—eager to dive deeper into the continuous deployment process. But I didn’t want to rely on pricey cloud solutions. So, I set out to craft a DIY GitHub Webhook Server capable of handling GitHub webhooks and automating tasks upon code updates—right from my local machine. Or any machine for that matter.
The Vision
Let's visualize a scenario with a repository (let's call it "MyAwesomeProject") sitting in GitHub, and you are somewhere in a remote cabin with okay-ish / dicey internet access. All you have is your laptop (let's make it a Chromebook!). You want to code snugly, and you want to update the server that sits at home. But… you don't WANT to remote into your server. You want it to be automatic. Like magic.
You would have to be prepared. You would clone my repo, configure your server (including port forwarding), and maybe use something like no-ip.com so you have a “fixed” URL to use your webhook with. Then:
Configuring Your Repository: Start by defining the essential details of “MyAwesomeProject” within the repositories.json file—things like secretEnvName, path, webhookPath, and composeFile.
Setting Up Your GitHub Webhook: Head over to your “MyAwesomeProject” repository on GitHub and configure a webhook. Simply point the payload URL to your server’s endpoint (e.g., http://your-ddns-domain.net/webhook/my-awesome-project).
Filtering Events: The server is smartly configured to respond only to push events occurring on the ‘main’ branch (refs/heads/main). This ensures that actions are triggered exclusively upon successful pushes to this branch.
Actions in Motion: Upon receiving a valid push event, the server swings into action, automatically executing a git pull on the 'main' branch of "MyAwesomeProject." Subsequently, Docker containers are rebuilt using the specified docker-compose.yml file (a minimal sketch of such a handler follows just below).
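The implementation details live in the repo, but the core idea fits in a few lines. Here is a minimal Python sketch of such a handler, assuming Flask and the repositories.json fields described above; it illustrates the concept and is not the code from my repo.

import hashlib
import hmac
import json
import os
import subprocess

from flask import Flask, abort, request

app = Flask(__name__)

with open("repositories.json", encoding="utf-8") as f:
    REPOS = json.load(f)  # keyed by webhookPath, e.g. "my-awesome-project"

def valid_signature(secret, payload, signature_header):
    # GitHub sends X-Hub-Signature-256: sha256=<HMAC of the raw body>
    expected = "sha256=" + hmac.new(secret.encode(), payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature_header or "")

@app.post("/webhook/<name>")
def handle(name):
    repo = REPOS.get(name)
    if repo is None:
        abort(404)
    secret = os.environ[repo["secretEnvName"]]
    if not valid_signature(secret, request.get_data(), request.headers.get("X-Hub-Signature-256")):
        abort(401)
    event = request.get_json(silent=True) or {}
    if event.get("ref") != "refs/heads/main":
        return {"status": "ignored"}  # only act on pushes to the 'main' branch
    # Pull the latest code, then rebuild the containers with the configured compose file.
    subprocess.run(["git", "pull"], cwd=repo["path"], check=True)
    subprocess.run(
        ["docker", "compose", "-f", repo["composeFile"], "up", "-d", "--build"],
        cwd=repo["path"],
        check=True,
    )
    return {"status": "deployed"}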
So, there you have it—a simplified solution for automating your project workflows using GitHub webhooks and a self-hosted server. But before we sign off, let’s talk security.
For an added layer of protection, consider setting up an Nginx server with a Let’s Encrypt SSL certificate. This secures the communication channel between GitHub and your server, ensuring data integrity and confidentiality.
While this article delves into the core aspects of webhook configuration and automation, diving into SSL setup with Nginx warrants its own discussion. Stay tuned for a follow-up article that covers this crucial security setup, fortifying your webhook infrastructure against potential vulnerabilities.
Through this journey of crafting my DIY GitHub Webhook Server, I’ve unlocked a deeper understanding of continuous deployment. Setting up repositories, configuring webhooks, and automating tasks upon code updates—all from my local setup—has been an enlightening experience. And it’s shown me that grasping the nuances of continuous deployment doesn’t always require expensive cloud solutions.