path: root/src/worker.c
Date | Commit message | Author
2023-07-18 | switch to JSON-RPC as message format | Egor Tensin
Instead of the weird `struct msg` I had, I switched to the JSON-RPC format. It's basically the same, but has well-defined semantics in case of errors.
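For illustration, here is a minimal sketch of what JSON-RPC 2.0 messages look like on the wire. It is not the project's code: the helpers `format_request` and `format_error` are hypothetical, and using "new-worker" as a method name is an assumption made for the example. The field names and the -32601 ("Method not found") code come from the JSON-RPC 2.0 specification.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Returns nonzero if s contains no quotes or backslashes. A real
 * implementation would escape them (and control characters) or use a JSON
 * library; this sketch just rejects such strings. */
static int is_plain(const char *s)
{
    return strpbrk(s, "\"\\") == NULL;
}

/* Hypothetical helper: formats a JSON-RPC 2.0 request without parameters.
 * Returns the snprintf result (>= size means the output was truncated),
 * or -1 if the method name would need escaping. */
static int format_request(char *buf, size_t size, int id, const char *method)
{
    assert(buf != NULL && method != NULL);
    if (!is_plain(method))
        return -1;
    return snprintf(buf, size,
                    "{\"jsonrpc\":\"2.0\",\"id\":%d,\"method\":\"%s\"}",
                    id, method);
}

/* Hypothetical helper: formats a JSON-RPC 2.0 error response. The "error"
 * member replaces "result", so the receiver can always tell failures from
 * successful calls. */
static int format_error(char *buf, size_t size, int id, int code,
                        const char *message)
{
    assert(buf != NULL && message != NULL);
    if (!is_plain(message))
        return -1;
    return snprintf(buf, size,
                    "{\"jsonrpc\":\"2.0\",\"id\":%d,"
                    "\"error\":{\"code\":%d,\"message\":\"%s\"}}",
                    id, code, message);
}
```

A response to a request with an unknown method would then be built with `format_error(buf, sizeof(buf), id, -32601, "Method not found")`.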
2023-07-09 | store process output in SQLite | Egor Tensin
2023-07-05 | worker: actually stay offline | Egor Tensin
I don't know what I was thinking, but contrary to my intention, the worker stayed connected to the server all the time.
2023-07-05 | tcp_server: keep track of client threads | Egor Tensin
This is a major change, obviously; brought to me by Valgrind, which noticed that we don't actually clean up after cimple-client threads. For a more thorough explanation, please see the added comment in tcp_server.c.
2023-07-04 | worker: close the leftover descriptor | Egor Tensin
Thanks, Valgrind! As a note: if I think that Valgrind reports a false positive, chances are, it's not.
2023-07-04 | move custom message parsing to a separate module | Egor Tensin
2023-07-04 | storage: mark completed runs as such | Egor Tensin
2023-06-30 | show git hash with --version | Egor Tensin
Also, use cmake's configure_file to build string constants in.
2023-06-13 | minor refactoring | Egor Tensin
2023-06-13 | signal: remove the stupid add_to_event_loop wrapper | Egor Tensin
2023-06-13 | event_loop: add event_loop_add_once | Egor Tensin
2023-06-13 | signal: refactoring | Egor Tensin
2023-06-13 | use signalfd to stop on SIGTERM | Egor Tensin
Is this overkill? I don't know. The thing is, correctly intercepting SIGTERM (and SIGINT, etc.) is incredibly tricky. For example, before this commit, my I/O loops in server.c and worker.c were inherently racy. This was immediately obvious if you tried to run the tests. The tests (especially the Valgrind flavour) would run a worker, wait until it printed a "Waiting for a new command" line, and try to kill it using SIGTERM. The problem is, the worker could have already executed the global_stop_flag check, in which case it would hang forever in recv().
The solution seems to be to use signalfd and select()/poll(). I've never used either before, but it seems to work well enough: at least the very same tests pass and don't hang now.
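The approach described in this commit can be sketched roughly as follows. This is a minimal illustration, not the code from worker.c: the names `stop_fd_create` and `wait_for_data_or_stop` are made up for the example, and it assumes Linux (signalfd is Linux-specific) and a single-threaded process (a multithreaded one would block the signals with pthread_sigmask before creating any threads).

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <signal.h>
#include <sys/signalfd.h>
#include <unistd.h>

/* Blocks SIGTERM and SIGINT and returns a descriptor that becomes readable
 * once one of them is pending. Returns -1 on error. */
static int stop_fd_create(void)
{
    sigset_t mask;
    sigemptyset(&mask);
    sigaddset(&mask, SIGTERM);
    sigaddset(&mask, SIGINT);

    /* The signals have to be blocked; otherwise they are still handled
     * according to their default dispositions and kill the process. */
    if (sigprocmask(SIG_BLOCK, &mask, NULL) < 0)
        return -1;

    return signalfd(-1, &mask, SFD_CLOEXEC);
}

/* Waits until either a stop signal is pending or conn_fd has an event.
 * Returns 1 if the caller should stop, 0 if conn_fd has an event (data,
 * EOF or an error, which the subsequent recv() will report), -1 on error. */
static int wait_for_data_or_stop(int conn_fd, int stop_fd)
{
    assert(conn_fd >= 0 && stop_fd >= 0);

    struct pollfd fds[2] = {
        {.fd = stop_fd, .events = POLLIN},
        {.fd = conn_fd, .events = POLLIN},
    };

    int ret;
    do {
        ret = poll(fds, 2, -1);
    } while (ret < 0 && errno == EINTR);
    if (ret < 0)
        return -1;

    /* The stop request takes priority over incoming data. */
    if (fds[0].revents & POLLIN)
        return 1;
    return 0;
}
```

The caller would only call recv() on conn_fd after `wait_for_data_or_stop` returns 0. Because a blocked signal stays pending on the descriptor instead of being delivered asynchronously, there is no window between checking a flag and entering the blocking call.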
2023-06-11 | msg: rework some APIs | Egor Tensin
2023-05-15 | signal: refactoring | Egor Tensin
2023-05-15 | minor refactoring | Egor Tensin
2023-05-15 | signal: refactoring, add comments in tcp_server, etc. | Egor Tensin
2023-05-15 | EINVAL means EINTR also? | Egor Tensin
2023-05-15 | rework server-worker communication | Egor Tensin
OK, this is a major rework.
* tcp_server: connection threads are not detached anymore, the caller has to clean them up. This was done so that the server can clean up the threads cleanly.
* run_queue: simple refactoring, run_queue_entry is called just run now.
* server: worker threads are now killed when a run is assigned to a worker.
* worker: the connection to server is no longer persistent. A worker sends "new-worker", waits for a task, closes the connection, and when it's done, sends the "complete" message and waits for a new task. This is supposed to improve resilience, since the worker-server connections don't have to be maintained while the worker is doing a CI run.
2023-05-14 | command: adjust order of parameters to handlers | Egor Tensin
2023-05-14 | process: add process_output_dump | Egor Tensin
2023-05-13 | ci_queue -> run_queue | Egor Tensin
Also, some minor refactoring.
2023-05-13 | command: refactoring | Egor Tensin
2023-05-13 | best practices & coding style fixes | Egor Tensin
* I don't really need to declare all variables at the top of the function anymore.
* Default-initialize variables more.
* Don't set the output parameter until the object is completely constructed.
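The three points above can be illustrated with a short sketch. The `struct job` type and the `job_create`/`job_destroy` functions are hypothetical and only serve the example; they are not taken from the project.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical type, for illustration only. */
struct job {
    char *url;
    char *rev;
};

static void job_destroy(struct job *job)
{
    if (!job)
        return;
    free(job->rev);
    free(job->url);
    free(job);
}

/* Returns a heap-allocated copy of s, or NULL if s is NULL or the
 * allocation fails. */
static char *copy_string(const char *s)
{
    if (!s)
        return NULL;
    size_t size = strlen(s) + 1;
    char *copy = malloc(size);
    if (copy)
        memcpy(copy, s, size);
    return copy;
}

/* Returns 0 on success, -1 on failure. *out is written only on success,
 * so the caller never sees a half-constructed object. */
static int job_create(struct job **out, const char *url, const char *rev)
{
    assert(out != NULL);

    /* Declared at the point of first use; calloc zero-initializes the
     * members, so job_destroy is safe on a partially built object. */
    struct job *job = calloc(1, sizeof(*job));
    if (!job)
        return -1;

    job->url = copy_string(url);
    if (!job->url)
        goto fail;

    job->rev = copy_string(rev);
    if (!job->rev)
        goto fail;

    /* The output parameter is set last, after everything succeeded. */
    *out = job;
    return 0;

fail:
    job_destroy(job);
    return -1;
}
```

If `job_create` fails halfway through, the caller's pointer keeps whatever value it had before the call.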
2023-05-13 | add command module to handle request-response communications | Egor Tensin
2023-05-06 | add a TODO note | Egor Tensin
2023-05-06 | shut down server/workers gracefully on SIGTERM | Egor Tensin
2023-05-06 | get rid of __attribute__((constructor)) | Egor Tensin
Explicit is better than implicit.
2023-04-29 | make struct ci_queue_entry opaque | Egor Tensin
2023-04-29 | make struct worker opaque | Egor Tensin
2023-04-27 | rename commands | Egor Tensin
2022-12-02 | add copyright notices | Egor Tensin
2022-09-08 | log: refactoring | Egor Tensin
2022-09-08 | sanitize #include-s | Egor Tensin
2022-08-28 | update command names | Egor Tensin
2022-08-28 | worker: fix a crash | Egor Tensin
Found when running in Docker.
2022-08-28 | make proper "error" messages | Egor Tensin
Previously, the client had no way to distinguish errors from successful calls.
2022-08-28 | make compilers happier | Egor Tensin
2022-08-28 | holy crap, it actually kinda works now | Egor Tensin
Previously, I had a stupid system where I would create a thread after every accept(), and put worker descriptors in a queue. A special "scheduler" thread would then pick them out, and give out jobs to complete. The problem was, of course, that I couldn't conveniently poll job status from workers. I thought about using poll(), but that turned out to be a horribly complicated API. How do I deal with partial reads, for example? I honestly don't know.
Then it hit me that I could just use the threads that handle accept()ed connections as "worker threads", which would synchronously schedule jobs and wait for them to complete. This solves every problem and removes the need for a lot of inter-thread synchronization magic. It even works now, holy crap! You can launch and terminate workers at will, and they will pick up new jobs automatically.
As a side note, msg_recv_and_handle turned out to be too limiting and complicated for me, so I got rid of that, and now make normal msg_recv/msg_send calls.
2022-08-26 | add check_errno macro | Egor Tensin
2022-08-26 | fix pthread error handling | Egor Tensin
pthread functions return positive error codes.
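To make the point concrete: unlike most system calls, pthread functions report failure through their return value (a positive errno-style code) and leave errno alone, so testing for `< 0` never detects an error. The sketch below shows one way to handle this; the `check_pthread` helper is hypothetical and is not the project's check_errno macro.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: takes the return value of a pthread function,
 * logs the error if there is one, and converts it to the 0/-1 convention. */
static int check_pthread(int ret, const char *what)
{
    assert(what != NULL);
    if (ret == 0)
        return 0;
    /* The error code is the return value itself, not errno. */
    fprintf(stderr, "%s: %s\n", what, strerror(ret));
    return -1;
}
```

For example, `check_pthread(pthread_mutex_lock(&mtx), "pthread_mutex_lock")` returns -1 and logs the reason if locking fails.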
2022-08-26 | worker: allow graceful shutdowns | Egor Tensin
Well, maybe "graceful" is a strong word, but now you _can_ do
    ./server & ./worker & ./client ci_run URL REV && kill "$( pidof worker )"
and the worker will wait for the CI run to complete.
2022-08-26 | worker: capture process output | Egor Tensin
2022-08-26 | add some more code | Egor Tensin
This adds a basic "worker" program. You can now do something like
    ./server & ./worker & ./client ci_run URL REV
and the server should pass a message to the worker, which should then clone the repository at URL, check out REV, and try to run the CI script. It's extremely unfinished: I need to sort out the graceful shutdown, how the server manages workers, etc.