Instead of the weird `struct msg` I had, I switched to the JSON-RPC
format. It's basically the same, but has well-defined semantics in case
of errors.
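For illustration, here's roughly what the exchange looks like in the
JSON-RPC 2.0 shape. The "queue-run" method name and the -32603 error
code are made up for this sketch, not taken from the actual protocol:

```
#include <stdio.h>

int main(void)
{
	/* Hypothetical request and error response in the JSON-RPC 2.0
	 * shape; the point is that errors have a well-defined form. */
	const char *request =
	    "{\"jsonrpc\": \"2.0\", \"method\": \"queue-run\","
	    " \"params\": {\"url\": \"...\"}, \"id\": 1}";
	const char *error_response =
	    "{\"jsonrpc\": \"2.0\","
	    " \"error\": {\"code\": -32603, \"message\": \"Internal error\"},"
	    " \"id\": 1}";

	printf("%s\n%s\n", request, error_response);
	return 0;
}
```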
|
And that they're marked as finished. It immediately exposed some
concurrency bugs, so some locking has been fixed.
|
This is a major change, obviously. It was brought to me by Valgrind,
which noticed that we don't actually clean up after cimple-client
threads. For a more thorough explanation, please see the added comment
in tcp_server.c.
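A minimal sketch of the kind of cleanup involved (the names here are
assumptions for illustration, not the actual tcp_server.c code): keep
the connection threads joinable, remember their IDs, and pthread_join()
them on shutdown, so their resources are actually reclaimed:

```
#include <pthread.h>
#include <stdlib.h>

struct thread_list {
	pthread_t *ids;
	size_t len, cap;
};

/* Remember a newly created (joinable) connection thread. */
static int thread_list_add(struct thread_list *list, pthread_t id)
{
	if (list->len == list->cap) {
		size_t cap = list->cap ? list->cap * 2 : 8;
		pthread_t *ids = realloc(list->ids, cap * sizeof(*ids));
		if (!ids)
			return -1;
		list->ids = ids;
		list->cap = cap;
	}
	list->ids[list->len++] = id;
	return 0;
}

/* Join every remembered thread on shutdown, reclaiming its resources. */
static void thread_list_join_all(struct thread_list *list)
{
	for (size_t i = 0; i < list->len; ++i)
		pthread_join(list->ids[i], NULL);
	free(list->ids);
	list->ids = NULL;
	list->len = list->cap = 0;
}
```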
|
Is this overkill? I don't know.

The thing is, correctly intercepting SIGTERM (also SIGINT, etc.) is
incredibly tricky. For example, before this commit, my I/O loops in
server.c and worker.c were inherently racy.

This was immediately obvious if you tried to run the tests. The tests
(especially the Valgrind flavour) would run a worker, wait until it
printed a "Waiting for a new command" line, and try to kill it using
SIGTERM. The problem is, the worker could have already executed its
global_stop_flag check, and it would then hang forever in recv().

The solution seems to be to use signalfd and select()/poll(). I've never
used either before, but it seems to work well enough - at least the very
same tests pass and don't hang now.
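A minimal sketch of the idea, assuming a single connection descriptor
(the real loops in server.c and worker.c obviously look different):
block the signals first, so SIGTERM can no longer fire "between" a stop
flag check and recv(); it can only make the signalfd descriptor
readable, and poll() sees that:

```
#include <poll.h>
#include <signal.h>
#include <sys/signalfd.h>
#include <unistd.h>

/* Returns 0 if we were told to stop, 1 if conn_fd has data, -1 on error. */
static int wait_for_data_or_signal(int conn_fd)
{
	sigset_t mask;
	sigemptyset(&mask);
	sigaddset(&mask, SIGTERM);
	sigaddset(&mask, SIGINT);
	/* Block the signals; they now only show up on sig_fd. */
	if (sigprocmask(SIG_BLOCK, &mask, NULL) < 0)
		return -1;

	int sig_fd = signalfd(-1, &mask, SFD_CLOEXEC);
	if (sig_fd < 0)
		return -1;

	struct pollfd fds[2] = {
		{.fd = sig_fd, .events = POLLIN},
		{.fd = conn_fd, .events = POLLIN},
	};

	int ret = -1;
	while (poll(fds, 2, -1) >= 0) {
		if (fds[0].revents & POLLIN) {
			/* Drain the signal info; we only care that we
			 * were told to stop. */
			struct signalfd_siginfo info;
			if (read(sig_fd, &info, sizeof(info)) < 0)
				break;
			ret = 0;
			break;
		}
		if (fds[1].revents & POLLIN) {
			/* It's now safe to recv() without hanging forever. */
			ret = 1;
			break;
		}
	}

	close(sig_fd);
	return ret;
}
```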
|
OK, this is a major rework.

* tcp_server: connection threads are not detached anymore; the caller
  has to clean them up. This was done so that the server can shut the
  threads down cleanly.
* run_queue: simple refactoring; run_queue_entry is now called just run.
* server: worker threads are now killed when a run is assigned to a
  worker.
* worker: the connection to the server is no longer persistent. A worker
  sends "new-worker", waits for a task, closes the connection, and when
  it's done, sends the "complete" message and waits for a new task (see
  the sketch below). This is supposed to improve resilience, since the
  worker-server connections don't have to be maintained while the worker
  is doing a CI run.
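A rough sketch of the reworked worker loop. Only the message flow comes
from the description above; the helper names and signatures
(net_connect, net_close, msg_send, msg_recv, msg_free, run_ci) are
assumptions for illustration:

```
#include <stdbool.h>

/* Provided elsewhere in the code base (names assumed for this sketch): */
extern volatile bool global_stop_flag;
struct msg;
int net_connect(const char *host, const char *port);
void net_close(int fd);
int msg_send(int fd, const char *text);
int msg_recv(int fd, struct msg **out);
void msg_free(struct msg *task);
int run_ci(const struct msg *task);

static int worker_loop(const char *host, const char *port)
{
	/* The first connection announces a new worker; every following
	 * one reports that the previous run is complete. */
	const char *greeting = "new-worker";

	while (!global_stop_flag) {
		int fd = net_connect(host, port);
		if (fd < 0)
			return -1;

		struct msg *task = NULL;
		if (msg_send(fd, greeting) < 0 || msg_recv(fd, &task) < 0) {
			net_close(fd);
			return -1;
		}
		/* Don't keep the connection open during the CI run. */
		net_close(fd);

		run_ci(task);
		msg_free(task);

		greeting = "complete";
	}
	return 0;
}
```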
|
Also, some minor refactoring.
|
* I don't really need to declare all variables at the top of the
  function anymore.
* Default-initialize variables more.
* Don't set the output parameter until the object is completely
  constructed (illustrated by the sketch below).
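The last point is a general C pattern; a hypothetical example (not
actual cimple code): build the object in a local variable, and only
assign to the output parameter once nothing can fail anymore, so the
caller never sees a half-constructed object:

```
#include <stdlib.h>
#include <string.h>

struct buf {
	char *data;
	size_t size;
};

int buf_create(struct buf **out, const char *src)
{
	struct buf *buf = calloc(1, sizeof(*buf)); /* Default-initialized. */
	if (!buf)
		return -1;

	buf->size = strlen(src) + 1;
	buf->data = malloc(buf->size);
	if (!buf->data) {
		free(buf);
		return -1;
	}
	memcpy(buf->data, src, buf->size);

	*out = buf; /* Only now is the output parameter touched. */
	return 0;
}
```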
|
Explicit is better than implicit.
|
This allows free workers to pick up jobs abandoned by dead workers.
|
The problem is that pthread_cond_destroy() is unsafe to call if there
are threads waiting in pthread_cond_wait(). I'm not sure this fix is
enough: what if the broadcast doesn't reach the threads before we call
pthread_cond_destroy()? Does it even work that way? I don't know.
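One arrangement that removes the doubt (a sketch with assumed names, not
necessarily the actual fix): wake the waiters, join every thread that
could be waiting, and only then destroy the condition variable, so no
thread can still be blocked on it:

```
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

struct queue {
	pthread_mutex_t mtx;
	pthread_cond_t cv;
	bool stopped;
};

static void queue_stop(struct queue *q, pthread_t *workers, size_t num_workers)
{
	pthread_mutex_lock(&q->mtx);
	q->stopped = true;
	pthread_cond_broadcast(&q->cv);
	pthread_mutex_unlock(&q->mtx);

	/* After the joins, no thread can be inside pthread_cond_wait(). */
	for (size_t i = 0; i < num_workers; ++i)
		pthread_join(workers[i], NULL);

	pthread_cond_destroy(&q->cv);
	pthread_mutex_destroy(&q->mtx);
}
```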
|
Previously, the client had no way to distinguish errors from successful
calls.
|
Previously, I had a stupid system where I would create a thread after
every accept() and put worker descriptors in a queue. A special
"scheduler" thread would then pick them out and give out jobs to
complete.

The problem was, of course, that I couldn't conveniently poll job status
from the workers. I thought about using poll(), but that turned out to
be a horribly complicated API. How do I deal with partial reads, for
example? I honestly don't know.

Then it hit me that I could just use the threads that handle accept()ed
connections as "worker threads", which would synchronously schedule jobs
and wait for them to complete. This solves every problem and removes the
need for a lot of inter-thread synchronization magic. It even works now,
holy crap! You can launch and terminate workers at will, and they will
pick up new jobs automatically.

As a side note, msg_recv_and_handle turned out to be too limiting and
complicated for me, so I got rid of it and do plain msg_recv/msg_send
calls.
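A rough sketch of the per-connection idea (the helper names and
signatures are assumptions for illustration): every accept()ed worker
connection gets a thread that blocks on the run queue, hands a job to
that worker over the same connection, and waits for the result:

```
#include <stdint.h>

/* Provided elsewhere (names assumed for this sketch): */
struct run;
struct run *run_queue_pop(void); /* Blocks until a run is queued. */
int msg_send_run(int fd, const struct run *run);
int msg_recv_result(int fd, struct run *run);
void net_close(int fd);

/* Started via pthread_create() for every accept()ed worker connection. */
static void *connection_thread(void *arg)
{
	int fd = (int)(intptr_t)arg;

	/* Block until there's a job, then schedule it synchronously on
	 * this worker's connection and wait for it to complete. */
	struct run *run = run_queue_pop();
	if (msg_send_run(fd, run) == 0)
		msg_recv_result(fd, run);

	net_close(fd);
	return NULL;
}
```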
|