[CentOS-devel] Introducing the ARC sub-team in CPE - and first research topic

Mon Jan 18 15:24:51 UTC 2021
Pierre-Yves Chibon <pingou at pingoured.fr>

Good Morning Everyone,

While planning work, the CPE team has realized that a number of our initiatives
actually start with a research phase to find the most appropriate technical
solution.
This makes planning difficult: without knowing which technical solution we
will adopt, it's hard to estimate the amount of work needed and thus the time
it'll take to do it.

In order to help with this, we're creating a small sub-team in CPE, called the
ARC team for Advance Reconnaissance Crew*.
The goal of this team will be to investigate what we believe to be the possible
technical solutions for initiatives and advise the team on what they believe
would be the appropriate solution.
To this end, we will reach out when we start an investigation, as you may have
ideas that we did not think of.

The first investigation, led by Will Woods, Mark O'Brien and me, will be around
datanommer and datagrepper.

datanommer is an application listening to fedmsg and filling a (postgresql)
database with all the messages passing on the bus.
datagrepper is a web application exposing these messages and offering a way to
filter or search them.
    available at: https://apps.fedoraproject.org/datagrepper/
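
To make the split between the two roles concrete, here is a minimal sketch. It
is not the actual datanommer or datagrepper code: the table layout and the
function names (open_store, store_message, search_messages) are assumptions
made up for illustration, and an in-memory sqlite database stands in for the
postgresql server so the example runs on its own.

```python
import json
import sqlite3


def open_store():
    # In-memory sqlite stands in for the postgresql database (assumption).
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE messages ("
        "id INTEGER PRIMARY KEY, topic TEXT, timestamp TEXT, msg TEXT)"
    )
    return conn


def store_message(conn, topic, timestamp, body):
    # datanommer-like role: persist every message seen on the bus.
    conn.execute(
        "INSERT INTO messages (topic, timestamp, msg) VALUES (?, ?, ?)",
        (topic, timestamp, json.dumps(body)),
    )


def search_messages(conn, topic_prefix):
    # datagrepper-like role: return stored messages whose topic starts
    # with the given prefix, oldest first.
    rows = conn.execute(
        "SELECT topic, timestamp, msg FROM messages "
        "WHERE substr(topic, 1, ?) = ? ORDER BY timestamp",
        (len(topic_prefix), topic_prefix),
    ).fetchall()
    return [(topic, ts, json.loads(msg)) for topic, ts, msg in rows]
```

In this sketch the writer and the reader only share the table, which is why
the storage layout can be discussed separately from the web front end.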

Currently our ideas are:
- for datanommer:
    - port it to fedora-messaging
    - adjust it to whichever solution we choose to replace datagrepper

- for datagrepper:
    - keep it as is
    - replace it with one of:
        - postgrest https://postgrest.org/
        - prest https://github.com/prest/prest
        - kinto https://docs.kinto-storage.org/en/stable/
        - Swagger/OpenAPI https://swagger.io/ 
    - add support for GraphQL

- for the postgresql server
    - Split messages per year into different tables
        - Unite them using a postgresql view
    - Kick out the old messages per year
        - Keep the current year + n-1 in the current DB
        - Kick the others to another DB?
        - Kick the others to a tarball somewhere?
        - Output the daily database dump to one file per year
    - TimescaleDB, a postgresql plugin for time-series data
        - https://alibaba-cloud.medium.com/postgresql-time-series-database-plug-in-timescaledb-deployment-practices-6a07e246eb0d 
        - https://dev.t-matix.com/blog/postgresql-as-a-time-series-database/ 
        - https://docs.timescale.com/latest/introduction 
    - Make the msg field in the message table be a JSON field
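
As a rough illustration of the "one table per year, united by a view" idea,
the sketch below creates yearly tables, exposes them through a single view and
retires an old year. It is only a sketch under assumptions: the table and view
names (messages_<year>, all_messages) and the helper functions are
hypothetical, the list of years is assumed to be non-empty, and an in-memory
sqlite database stands in for postgresql so the example is self-contained.

```python
import sqlite3


def year_table(year):
    # int() guards the table name, which cannot be a bound parameter.
    return "messages_%d" % int(year)


def rebuild_view(conn, years):
    # The view unites all yearly tables so readers query a single name.
    selects = " UNION ALL ".join(
        "SELECT topic, timestamp, msg FROM %s" % year_table(y) for y in years
    )
    conn.execute("DROP VIEW IF EXISTS all_messages")
    conn.execute("CREATE VIEW all_messages AS " + selects)


def create_year_tables(conn, years):
    for year in years:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS %s ("
            "id INTEGER PRIMARY KEY, topic TEXT, timestamp TEXT, msg TEXT)"
            % year_table(year)
        )
    rebuild_view(conn, years)


def insert_message(conn, topic, timestamp, msg):
    # Route each message to its year's table using the ISO timestamp prefix.
    table = year_table(timestamp[:4])
    conn.execute(
        "INSERT INTO %s (topic, timestamp, msg) VALUES (?, ?, ?)" % table,
        (topic, timestamp, msg),
    )


def retire_year(conn, year, remaining_years):
    # "Kick out" an old year: drop its table and rebuild the view without it.
    # A real setup would first dump or copy the table somewhere else.
    conn.execute("DROP VIEW IF EXISTS all_messages")
    conn.execute("DROP TABLE %s" % year_table(year))
    rebuild_view(conn, remaining_years)
```

Readers only ever query the view, so retiring a year changes what the view
returns without changing how it is queried.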

Would you have any other ideas of things we could look at?


Looking forward to your input,

Thanks,
Pierre, Will and Mark


* Our notes and documentation are hosted at:
  https://fedora-arc.readthedocs.io/en/latest/index.html