How do you do, fellow reader? My name is Aleksei, I'm a newbie here at Commmune, and I'm writing this article on behalf of the Productivity team. We've just started our journey into the great automated and productive future, and here's our first task: replacing the old deployment flow. As a first step, we are vivisecting the current "CI" workflow based on GitHub Actions so that every PR can be delivered to prod as soon as it has passed all checks and automated tests.
Prerequisites
Our project is a TypeScript monorepo with a file structure like this:
- client: Next app + Express-based static client
- server: internal API server for the client
- public-api: public API server
- shared: shared code
- db-layer: db connection related code
We don't use a build system: everything is managed with npm, and builds use the standard Next compiler and tsc.
Old workflow
Our old workflow was pretty simple: a single job that ran on every push to a PR and on every commit to the master and develop branches:
name: CI

on:
  pull_request:
    types: [opened, synchronize, reopened, ready_for_review]
  push:
    branches: [develop, master]

concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true

jobs:
  static-code-analysis:
    runs-on: ubuntu-latest
    if: github.event.pull_request.draft == false
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-node@v3
      - name: Cache node modules
        id: cache-npm
        uses: actions/cache@v3
        env:
          cache-name: cache-node-modules
        with:
          path: |
            **/node_modules
          key: ${{ runner.os }}-build-${{ env.cache-name }}-${{ hashFiles('**/package-lock.json') }}
          restore-keys: |
            ${{ runner.os }}-build-${{ env.cache-name }}-
            ${{ runner.os }}-build-
            ${{ runner.os }}-
      - if: ${{ steps.cache-npm.outputs.cache-hit != 'true' }}
        name: Install dependencies on all repositories
        run: npm run all-install
      - name: Test client
        run: npm --prefix ./client run test
      - name: Test server
        run: npm --prefix ./server run test
      - name: Lint and type check
        run: npm run all-lint
      - name: Format check
        run: npm run check:prettier
      - name: Spell check
        run: npm run check:cspell
As you can see, there are several problems with this workflow:
- Static analysis runs for the entire project, no matter what was changed
- The same goes for the unit tests: the workflow runs the tests for the client even when you change only the README file or the server
- No e2e or integration tests are run for the application or for the internal and public APIs. As a result, the quality of the code merged into the develop branch is low, and we often have to revert or hotfix.
- Tests and linting run sequentially
New unit tests and linting workflow
The brand-new concept was made to resolve the issues above. We've split our workflow by directory into five workflows, plus one more that reports the required jobs as skipped for files that are not covered by the path filters of the other five. A typical workflow looks like this:
name: Client CI

env:
  app_name: client

on:
  pull_request:
    types: [opened, synchronize, reopened, ready_for_review]
    paths:
      - 'client/**'
      - '!**.md'
      - 'shared/**'
  push:
    paths:
      - 'client/**'
      - '!**.md'
      - 'shared/**'
    branches: [develop]

concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true

jobs:
  lint-and-test:
    runs-on: ubuntu-latest
    if: github.event.pull_request.draft == false
    steps:
      - uses: actions/checkout@v3
      - name: Bootstrap
        uses: './.github/actions/bootstrap'
      - name: Formatting
        uses: './.github/actions/format'
        with:
          dir: ${{env.app_name}}
      - name: Test
        working-directory: ${{ env.app_name }}
        run: npm run test
The Bootstrap and Formatting steps are local actions that bundle the following steps:

graph LR;
  job[Bootstrap] --> install-deps[Install Dependencies]
  job --> cache[Restore Cache]
  job --> timezone[Set Timezone]
  job --> setup-node[Setup node.js]

graph LR;
  job[Formatting] --> lint[ESLint + Typechecking]
  job --> prettier[Prettier]
  job --> cspell[CSpell]
For the shared and db-layer folders, we just omit the testing part.
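One subtlety of the paths filter in the workflow above is that pattern order matters. As far as we understand the GitHub documentation, the last pattern that matches a file decides whether that file counts, so 'shared/**' listed after '!**.md' brings Markdown files under shared back in. Below is a minimal TypeScript sketch of that rule. It is a simplified model that only understands the three pattern shapes used in our workflow, and the helper names (parsePattern, isIncluded, shouldRun) are made up for illustration; they are not a GitHub API.

```typescript
// Simplified model of a GitHub Actions `paths` filter (illustration only).
// Supports just three shapes: "dir/**", "**.ext" and an exact path.
type PathPattern = { negated: boolean; matches: (file: string) => boolean };

function parsePattern(raw: string): PathPattern {
  const negated = raw.startsWith("!");
  const glob = negated ? raw.slice(1) : raw;
  if (glob.endsWith("/**")) {
    const prefix = glob.slice(0, -2); // "client/**" -> "client/"
    return { negated, matches: (file: string) => file.startsWith(prefix) };
  }
  if (glob.startsWith("**")) {
    const suffix = glob.slice(2); // "**.md" -> ".md"
    return { negated, matches: (file: string) => file.endsWith(suffix) };
  }
  return { negated, matches: (file: string) => file === glob };
}

// The last pattern that matches a file decides whether it is included.
function isIncluded(file: string, patterns: string[]): boolean {
  let included = false;
  for (const pattern of patterns.map(parsePattern)) {
    if (pattern.matches(file)) {
      included = !pattern.negated;
    }
  }
  return included;
}

// A workflow is triggered when at least one changed file is included.
function shouldRun(changedFiles: string[], patterns: string[]): boolean {
  return changedFiles.some((file) => isIncluded(file, patterns));
}

// The filter from the Client CI workflow; the file names are hypothetical.
const clientPaths = ["client/**", "!**.md", "shared/**"];
const runsForClientCode = shouldRun(["client/pages/index.tsx"], clientPaths); // true
const runsForClientDocs = shouldRun(["client/README.md"], clientPaths); // false
const runsForSharedDocs = shouldRun(["shared/README.md"], clientPaths); // true
const runsForServerCode = shouldRun(["server/index.ts"], clientPaths); // false
```

If Markdown changes under shared should be ignored as well, this model suggests moving '!**.md' to the end of the list.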
E2E workflow
We used a matrix approach to run our Cypress tests in parallel. So it looks like this (some unimportant parts were changed or removed):
name: E2E tests

on:
  pull_request_review:
    types: [submitted]
  push:
    branches: [develop]

concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true

jobs:
  # To reduce costs, we run e2e tests just before merging to the develop branch, after 2 PR approvals
  should-run:
    if: github.event.review.state == 'approved'
    runs-on: ubuntu-latest
    outputs:
      approved: ${{ steps.approved.outputs.approved }}
    steps:
      - uses: './.github/actions/approvals'
        id: approved

  # Before the e2e run we don't have any images or artifacts of our project,
  # so we build our Next app here and reuse it in our matrix jobs
  build:
    runs-on: ubuntu-latest
    needs: should-run
    if: ${{needs.should-run.outputs.approved == 'true'}}
    steps:
      - uses: actions/checkout@v3
      - name: Bootstrap
        uses: './.github/actions/bootstrap'
      - name: Build client
        run: npm run build
      - uses: actions/upload-artifact@v3
        with:
          name: dist
          path: client/dist
          if-no-files-found: error

  e2e:
    strategy:
      fail-fast: false
      matrix:
        spec: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    runs-on: ubuntu-latest
    needs: build
    steps:
      - uses: actions/checkout@v3
      - name: Bootstrap
        uses: './.github/actions/bootstrap'
      # For correct display of Japanese characters in Cypress
      - name: Install fonts-noto
        run: sudo apt install fonts-noto
        shell: bash
      # We don't use the `service` feature of GitHub Actions to run e.g. `mysql`,
      # because `service` is not as configurable as a docker-compose file, and since
      # we already use one for local development, just running the containers is pretty easy.
      # `-d` starts the containers in the background so the step can finish.
      - name: Run containers
        run: docker compose up -d
      # Run in the background and save logs to a file
      - name: Run server
        run: npm run server > server-log${{matrix.spec}} &
      # Download the build into the same path it was uploaded from
      - uses: actions/download-artifact@v3
        with:
          name: dist
          path: client/dist
      # Same with the client
      - name: Run client
        run: npm run client > client-log${{matrix.spec}} &
      - name: Wait for server and client
        run: |
          npm run wait -- http://localhost:15000/echo -t 120000
          npm run wait -- http://localhost:3000 -t 60000
      # This step runs the script that splits all our test files into 10 parts
      # and passes them as a list of parameters to cypress run --spec
      - name: Run tests
        uses: './.github/actions/e2e'
        with:
          dir: application
          split-into: 10
          file-num: ${{matrix.spec}}
      # Save screenshots of failed tests and logs of the client and server
      - name: Save screenshots
        if: failure()
        uses: actions/upload-artifact@v3
        with:
          name: screenshots
          if-no-files-found: ignore
          path: e2e/cypress/screenshots/**
      - name: Save logs
        if: failure()
        uses: actions/upload-artifact@v3
        with:
          name: logs
          path: ./*-log*

  e2e-success:
    runs-on: ubuntu-latest
    needs: e2e
    steps:
      - name: All tests ok
        if: ${{ !(contains(needs.*.result, 'failure')) }}
        run: exit 0
      - name: Some tests failed
        if: ${{ contains(needs.*.result, 'failure') }}
        run: exit 1
Unfortunately, you can't specify just the e2e job as a branch protection check: because it uses a matrix, it is not a single job but 10 different ones, and you would need to list all of them in the rule. Luckily, there is the e2e-success job (thanks, Stack Overflow), so we can use it as the single required check and not worry about matrix expansion in the future.
Also note that for Cypress to display Japanese characters correctly, you need to install the Noto fonts, as the default Ubuntu image used in GitHub Actions doesn't include them.
Money-saving techniques
ChatGPT knows
Use it wisely: most of the time, you can ask it something like "please make it simple" or "simplify", and the suggested solution becomes much better. Reformulate your request if the answers seem strange or inappropriate. And of course, don't forget to use your own head 😄
We've used it to generate some useful bash scripts for splitting the tests into chunks and counting the average number of PRs the team merges each month, and in my opinion it's one of the best things you can use ChatGPT for. The generated scripts are very precise and customizable, and Chat definitely knows more git commands than any of us.
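For example, here is the generated script for the number of PRs (counted by the squash commits to the develop branch). As written, it reports the average per week, and it relies on gdate, which is GNU date as installed by Homebrew's coreutils on macOS; on Linux, plain date does the same job.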
# Set the start and end dates for the period to calculate the average commits
start_date="2022-03-08"
end_date="2023-03-08"

# Get the total number of commits during the period
total_commits=$(git log --oneline --after="$start_date" --before="$end_date" | wc -l)

# Calculate the number of weeks between the start and end dates
start_timestamp=$(gdate -d "$start_date" +%s)
end_timestamp=$(gdate -d "$end_date" +%s)
total_weeks=$(( ($end_timestamp - $start_timestamp) / (7 * 24 * 3600) ))

# Calculate the average commits per week
average_commits=$(echo "scale=2; $total_commits / $total_weeks" | bc)

# Print the result
echo "The average number of commits per week between $start_date and $end_date is $average_commits."
And another one for splitting the tests.
find cypress/tests -type f -name "*.spec.ts" > $input_file
split --lines=$lines_per_file $input_file "./tmp/spec_" --numeric-suffixes=1
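The same splitting step can be sketched in the project's own language. The snippet below mimics what find plus split do: it cuts a list of spec files into consecutive chunks and picks the chunk for one matrix job. The function names, the way the chunk size is derived, and the sample file names are assumptions for illustration, not the contents of our actual e2e action.

```typescript
// Cut the list into consecutive chunks, like `split --lines` does with a file.
// Assumption: chunk size = number of specs / number of jobs, rounded up.
function splitIntoChunks(specs: string[], parts: number): string[][] {
  if (!Number.isInteger(parts) || parts < 1) {
    throw new Error("parts must be a positive integer");
  }
  if (specs.length === 0) {
    return [];
  }
  const linesPerChunk = Math.ceil(specs.length / parts);
  const chunks: string[][] = [];
  for (let start = 0; start < specs.length; start += linesPerChunk) {
    chunks.push(specs.slice(start, start + linesPerChunk));
  }
  return chunks;
}

// Build the comma-separated value for `cypress run --spec` for one matrix job.
// `fileNum` is 1-based, like the `file-num` input in the workflow.
function specArgForJob(specs: string[], splitInto: number, fileNum: number): string {
  const chunk = splitIntoChunks(specs, splitInto)[fileNum - 1] ?? [];
  return chunk.join(",");
}

// Hypothetical spec files, for illustration only.
const exampleSpecs = ["a.spec.ts", "b.spec.ts", "c.spec.ts", "d.spec.ts", "e.spec.ts"];
const firstJobSpecs = specArgForJob(exampleSpecs, 3, 1); // "a.spec.ts,b.spec.ts"
const thirdJobSpecs = specArgForJob(exampleSpecs, 3, 3); // "e.spec.ts"
```

Note that with this rounding some matrix jobs can end up with an empty list (for example, 4 specs split into 3 parts yield only 2 chunks), so the job has to handle an empty --spec value.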
We've also tried asking ChatGPT to improve the workflow itself and to answer some complicated questions about GitHub Actions, but since its knowledge stops at its training cutoff, it can't help you with the latest changes.
Combine small jobs
You can combine the linting and testing jobs into one. Linting itself doesn't take much time, but a separate job still has to do its own checkout and install or restore dependencies from the cache. So combining them doesn't make much of a difference in wall-clock time, but it does save money: the setup is paid for once instead of twice, and GitHub rounds each job's billed time up to the next whole minute. The same applies to any small job, even if it does the task multiple times.
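To make the saving concrete, here is a small sketch of the arithmetic, assuming the per-job rounding to whole minutes that GitHub describes in its billing documentation. All durations are made-up numbers for illustration, not measurements from our pipeline.

```typescript
// Billed minutes for a set of jobs: each job is rounded up to a whole minute.
function billedMinutes(jobDurationsSeconds: number[]): number {
  return jobDurationsSeconds.reduce(
    (total, seconds) => total + Math.ceil(seconds / 60),
    0,
  );
}

// Hypothetical durations in seconds.
const setupSeconds = 40; // checkout + restore dependencies from cache
const lintSeconds = 50;
const testSeconds = 100;

// Two jobs, each paying for its own setup: ceil(90 / 60) + ceil(140 / 60) = 2 + 3
const separateJobs = billedMinutes([setupSeconds + lintSeconds, setupSeconds + testSeconds]); // 5

// One job with a single setup: ceil(190 / 60)
const combinedJob = billedMinutes([setupSeconds + lintSeconds + testSeconds]); // 4
```

The combined job takes a bit longer to finish than the slower of the two separate jobs, which is the trade-off described above.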
Move everything outside of the matrix
This applies especially to builds and other expensive steps: run them once and share the result with the matrix jobs through artifacts. But also don't forget about the previous point.
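A rough way to compare the two layouts is to add up the runner time each one consumes. The sketch below does that; the type, the function names, and every number in it are hypothetical and only illustrate the shape of the calculation.

```typescript
// Hypothetical timings (in seconds) for a matrix pipeline.
type MatrixPlan = {
  buildSeconds: number; // one build of the client
  testSecondsPerJob: number; // e2e tests in a single matrix job
  artifactSeconds: number; // one artifact upload or download
  matrixJobs: number; // number of parallel matrix jobs
};

// Every matrix job builds the client itself.
function runnerSecondsBuildInside(plan: MatrixPlan): number {
  return plan.matrixJobs * (plan.buildSeconds + plan.testSecondsPerJob);
}

// One build job uploads the artifact; every matrix job only downloads it.
function runnerSecondsBuildOutside(plan: MatrixPlan): number {
  const buildJob = plan.buildSeconds + plan.artifactSeconds;
  const eachMatrixJob = plan.artifactSeconds + plan.testSecondsPerJob;
  return buildJob + plan.matrixJobs * eachMatrixJob;
}

const examplePlan: MatrixPlan = {
  buildSeconds: 180,
  testSecondsPerJob: 240,
  artifactSeconds: 15,
  matrixJobs: 10,
};

const insideTotal = runnerSecondsBuildInside(examplePlan); // 10 * 420 = 4200
const outsideTotal = runnerSecondsBuildOutside(examplePlan); // 195 + 10 * 255 = 2745
```

The gap grows with the number of matrix jobs and with the cost of the shared step, which is why builds are the first candidate to move out.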
Use extensions and reuse existing Actions
For VS Code, you can use the GitHub Actions extension, which shows you the errors in your workflow before the run (hello Jenkins).
And of course, don't overthink your own solution if one already exists on the Marketplace; just use it.
Problems
Lots of actions use deprecated libs
A serious problem I faced during development was that many useful actions available on the Marketplace haven't been updated for months or even years, and there are no fresh alternatives. So be sure to check the warnings on GitHub and be ready to replace such actions with your own solutions.
Path filtering and required jobs
Don't forget to add special workflows for skipped but required checks. Otherwise, you can't merge a PR that only changes files outside the path filters of your workflows, because the required checks never report a status.
Testing might be expensive
Debugging workflows in the real repository costs money, so enjoy the power of free GitHub Actions access on a personal account: you can easily replicate the structure of your project there and run similar workflows in a playground for testing purposes. It also shortens the feedback loop.
Next steps
We have done a lot of work to improve our workflow, but I know we can do better, especially in execution time, so here is what I suggest we do next.
Optimize e2e tests run
As you saw in the script, our current e2e test splitting is very simple: it just uses find and split without any extra optimization logic.
That is not the best solution, because the execution time per job can vary significantly. We can analyze the Cypress logs and execution times to distribute the tests more evenly.
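One possible direction, sketched here as an assumption rather than something we have implemented, is a greedy "slowest first" distribution: sort the spec files by their measured duration and always hand the next one to the job with the least work so far. The type names, the function name, and the timings below are hypothetical; real durations would have to come from the Cypress logs.

```typescript
type SpecTiming = { file: string; seconds: number };
type Bucket = { files: string[]; totalSeconds: number };

// Greedy distribution: the slowest specs are placed first, each into the
// job that currently has the smallest total duration.
function distributeByDuration(specs: SpecTiming[], jobs: number): Bucket[] {
  if (!Number.isInteger(jobs) || jobs < 1) {
    throw new Error("jobs must be a positive integer");
  }
  const buckets: Bucket[] = Array.from(
    { length: jobs },
    (): Bucket => ({ files: [], totalSeconds: 0 }),
  );
  const slowestFirst = [...specs].sort((a, b) => b.seconds - a.seconds);
  for (const spec of slowestFirst) {
    const lightest = buckets.reduce((min, bucket) =>
      bucket.totalSeconds < min.totalSeconds ? bucket : min,
    );
    lightest.files.push(spec.file);
    lightest.totalSeconds += spec.seconds;
  }
  return buckets;
}

// Hypothetical timings, for illustration only.
const exampleTimings: SpecTiming[] = [
  { file: "a.spec.ts", seconds: 300 },
  { file: "b.spec.ts", seconds: 200 },
  { file: "c.spec.ts", seconds: 120 },
  { file: "d.spec.ts", seconds: 100 },
  { file: "e.spec.ts", seconds: 80 },
];

// Two jobs of 400 seconds each: [a, d] and [b, c, e].
const balanced = distributeByDuration(exampleTimings, 2);
```

For comparison, cutting the same list into consecutive chunks of three and two files would give jobs of 620 and 180 seconds, so the whole run would wait for the slow one.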
Use a modern build system for monorepo
Right now our dependency management is kind of chaotic: our packages are split between package.json files, and some packages use global dependencies alongside the local ones, so you have to install everything just to be sure that you can build and run, e.g., only the client. It would also be nice to cache dependencies in a more sophisticated way, both in the local environment and between builds in CI, or even to use remote caching. We can achieve this with one of the modern monorepo build systems, like Nx or Turborepo.
Optimization of run services and containers
In conclusion
As you can see, we still have a lot of work to do and our CI is far from perfect, so these changes are just the start of something good. We will try to keep you updated about interesting findings from our use of GitHub Actions and about further improvements. Thank you for reading this far!
If you want to help us with this difficult task and are not afraid of the actively changing startup environment, please check out our open positions. We are trying to make our environment more international and foreigner-friendly, and at this stage, your impact will be enormous. Let's build a great culture together!
commmune-careers.studio.site
https://speakerdeck.com/commmune/commmune-introduction-eng