2024-01-11 FanApps Plan

 

Project description

Project subject:

 

Github: https://github.com/asya-ai/fanapps-quizzes.git

Project contacts on the FanApps side: arturs@fanapps.io, +37128263904; linda@fanapps.io, +37125997012

Ground truth samples showing the level of answers required: http://share.yellowrobot.xyz/quick/2024-1-11-E7B7165D-FFA0-4A45-A8EE-EB7D282C860F.pdf

Figma designs for the final system on the FanApps side: https://www.figma.com/proto/rARhemzAXwxEk8nQNzRdmN/UseAward?page-id=611%3A19510&type=design&node-id=1361-20119&viewport=4970%2C-896%2C0.06&t=Lg5h2E0cKkcNB8qg-1&scaling=scale-down-width&starting-point-node-id=1361%3A20119&mode=design

 

Plan

Plan:

  1. FastAPI Swagger-based API

    1. Deadline: 19 January

  2. Adding PDF files to generate quizzes (for testing, save FC Rīga pages as PDF)

    1. Deadline: 19 January

  3. First version that generates quizzes

    1. Deadline: 26 January

  4. Adding URLs in HTML format

    1. Deadline: 2 February

  5. Prepare the system to automatically fetch the latest club data from an API so that quizzes can be regenerated (initially FC Rīga); data preparation will have to be adapted for each API. How exactly the data will arrive still needs to be clarified and agreed.

    1. Deadline:

        1. February
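Item 5 notes that data preparation has to be adapted for each API. One possible shape for that is a small adapter registry, sketched below in plain Python. Everything named here is hypothetical: the `ADAPTERS` registry, the `register_adapter` decorator, the `example_club_api` source name and the payload layout are illustrative assumptions, since the real data format has not been agreed yet.

```python
from typing import Callable, Dict, List

# Hypothetical registry: source name -> function that turns that API's
# raw payload into a list of plain-text facts.
ADAPTERS: Dict[str, Callable[[dict], List[str]]] = {}


def register_adapter(name: str):
    """Decorator that registers a per-API data preparation function."""
    def decorator(func: Callable[[dict], List[str]]):
        ADAPTERS[name] = func
        return func
    return decorator


@register_adapter("example_club_api")
def example_club_adapter(payload: dict) -> List[str]:
    # Assumed payload shape (not from any real API):
    # {"matches": [{"home": str, "away": str, "score": str}, ...]}
    return [
        f"{m['home']} played {m['away']}; final score {m['score']}."
        for m in payload.get("matches", [])
    ]


def prepare_facts(source_name: str, payload: dict) -> List[str]:
    """Pick the adapter for the given API and return fact texts."""
    try:
        adapter = ADAPTERS[source_name]
    except KeyError:
        raise ValueError(f"no adapter registered for {source_name!r}") from None
    return adapter(payload)
```

Each returned text could then be stored through the `add_fact` function listed under API functions, so that supporting a new club API would only mean writing one more adapter.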

 

 

API functions

API functions (authentication with an API key)

  1. add_client (api_key: str, name: str) -> client_uuid: str

  2. delete_clients (api_key: str, client_uuids: str[])

  3. list_clients(api_key: str) -> Client[]

  4. add_sources_file (api_key: str, client_uuid, file: FileUpload, tags: str[]) -> source_uuid ^ Initially in PDF format

  5. add_sources_url (api_key: str, client_uuid, url: str, tags: str[]) -> source_uuid

  6. list_sources(api_key: str, client_uuid) -> Source[]

  7. update_sources(api_key: str, client_uuid, sources: Source[])

  8. delete_sources(api_key: str, client_uuid, sources_uuids: str[])

  9. add_fact (api_key: str, client_uuid, text: str, tags: str[]) -> fact_uuid ^ So that the latest information from external APIs can be stored

  10. list_facts(api_key: str, client_uuid) -> Fact[]

  11. update_facts(api_key: str, client_uuid, facts: Fact[])

  12. delete_facts(api_key: str, client_uuid, facts_uuids: str[])

  13. generate_questions(api_key: str, client_uuid, language_code: str, tags: str[], callback_url: str) -> batch_uuid

  14. list_questions(api_key: str, client_uuid, batch_uuid) -> Question[] ^ Ready after 5 min

  15. update_questions(api_key: str, client_uuid, questions: Question[]) ^ Needed so that we can register the questions that have been used and prevent them from being repeated

  16. list_question_batches(api_key: str, client_uuid) -> Batch[]

  17. list_questions(api_key: str, client_uuid, batch_uuid) -> Question[]

  18. delete_questions(api_key: str, client_uuid, questions_uuids: str[])
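The API-key check and the soft-delete behaviour implied by the list above could look roughly like this. It is a framework-agnostic sketch in plain Python rather than actual FastAPI endpoints; the `ClientRegistry` class, the in-memory storage and raising `PermissionError` on a bad key are assumptions made for illustration, not part of the agreed design.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Client:
    client_uuid: str
    name: str
    is_deleted: bool = False  # soft delete, mirroring the is_deleted columns


@dataclass
class ClientRegistry:
    """Hypothetical in-memory stand-in for one organization's clients."""
    api_key: str
    _clients: Dict[str, Client] = field(default_factory=dict)

    def _check_key(self, api_key: str) -> None:
        # Every API function takes api_key as its first argument.
        if api_key != self.api_key:
            raise PermissionError("invalid API key")

    def add_client(self, api_key: str, name: str) -> str:
        self._check_key(api_key)
        client_uuid = str(uuid.uuid4())
        self._clients[client_uuid] = Client(client_uuid, name)
        return client_uuid

    def delete_clients(self, api_key: str, client_uuids: List[str]) -> None:
        self._check_key(api_key)
        for client_uuid in client_uuids:
            if client_uuid in self._clients:
                # Mark as deleted instead of removing the record.
                self._clients[client_uuid].is_deleted = True

    def list_clients(self, api_key: str) -> List[Client]:
        self._check_key(api_key)
        return [c for c in self._clients.values() if not c.is_deleted]
```

The same pattern (check the key, act on the client's records, never hard-delete) would carry over to the sources, facts and questions functions.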

Comments:

Database structure

The diagram marks every relationship as one-to-many (1 to 0..*); the relationships follow the FK columns listed below.

organizations
  organization_id: PK, SERIAL
  organization_uuid: UUIDv4
  api_key: UUIDv4
  is_deleted: BOOLEAN
  created: TIMESTAMP
  modified: TIMESTAMP
  organization_name: VARCHAR(512)

clients
  client_id: PK, SERIAL
  organization_id: FK, organizations.organization_id
  client_uuid: UUIDv4
  is_deleted: BOOLEAN
  created: TIMESTAMP
  modified: TIMESTAMP
  client_name: VARCHAR(512)

sources
  source_id: PK, SERIAL
  client_id: FK, clients.client_id
  source_uuid: UUIDv4
  source_text: TEXT
  is_deleted: BOOLEAN
  created: TIMESTAMP
  modified: TIMESTAMP
  source_type: VARCHAR(8) : ENUM('pdf', 'url', 'api')
  source_url: VARCHAR(512)
  language_code: VARCHAR(2) : ENUM('lv', 'en', 'es', 'pt')

facts
  fact_id: PK, SERIAL
  client_id: FK, clients.client_id
  source_id: FK, sources.source_id
  fact_uuid: UUIDv4
  fact_text: TEXT
  fact_text_en: TEXT
  fact_embedding: pgvector.vector
  is_deleted: BOOLEAN
  created: TIMESTAMP
  modified: TIMESTAMP
  language_code: VARCHAR(2) : ENUM('lv', 'en', 'es', 'pt')

tags
  tag_id: PK, SERIAL
  tag_embedding: pgvector.vector
  is_deleted: BOOLEAN
  created: TIMESTAMP
  modified: TIMESTAMP
  tag_name: VARCHAR(512)

tags_in_facts
  tag_in_fact_id: PK, SERIAL
  fact_id: FK, facts.fact_id
  tag_id: FK, tags.tag_id
  is_deleted: BOOLEAN
  created: TIMESTAMP
  modified: TIMESTAMP

sources_in_facts
  source_in_fact_id: PK, SERIAL
  fact_id: FK, facts.fact_id
  source_id: FK, sources.source_id
  is_deleted: BOOLEAN
  created: TIMESTAMP
  modified: TIMESTAMP

question_batches
  question_batch_id: PK, SERIAL
  client_id: FK, clients.client_id
  question_batch_uuid: UUIDv4
  is_deleted: BOOLEAN
  created: TIMESTAMP
  modified: TIMESTAMP
  language_code: VARCHAR(2) : ENUM('lv', 'en', 'es', 'pt')
  tags: VARCHAR(512)
  callback_url: VARCHAR(512)

questions
  question_id: PK, SERIAL
  question_batch_id: FK, question_batches.question_batch_id
  question_uuid: UUIDv4
  question_text: TEXT
  question_embedding: pgvector.vector
  is_deleted: BOOLEAN
  created: TIMESTAMP
  modified: TIMESTAMP

question_answers
  question_answer_id: PK, SERIAL
  question_id: FK, questions.question_id
  question_answer_uuid: UUIDv4
  question_answer_text: TEXT
  is_correct: BOOLEAN
  order_number: INT
  is_deleted: BOOLEAN
  created: TIMESTAMP
  modified: TIMESTAMP
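To make the many-to-many link between facts and tags concrete, here is a minimal sketch of the `facts`, `tags` and `tags_in_facts` tables with a soft-delete-aware lookup. It uses Python's built-in `sqlite3` purely as a stand-in so that it runs without a server; the real system targets PostgreSQL with `SERIAL`, UUID and pgvector columns, so the reduced column set, the `UNIQUE` constraint on `tag_name` and the helper names `add_fact` and `facts_by_tag` are simplifying assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE facts (
    fact_id    INTEGER PRIMARY KEY,
    fact_text  TEXT NOT NULL,
    is_deleted INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE tags (
    tag_id     INTEGER PRIMARY KEY,
    tag_name   TEXT NOT NULL UNIQUE,  -- uniqueness is an assumption
    is_deleted INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE tags_in_facts (
    tag_in_fact_id INTEGER PRIMARY KEY,
    fact_id    INTEGER NOT NULL REFERENCES facts(fact_id),
    tag_id     INTEGER NOT NULL REFERENCES tags(tag_id),
    is_deleted INTEGER NOT NULL DEFAULT 0
);
""")


def add_fact(conn, text, tags):
    """Insert a fact and link it to each tag, creating missing tags."""
    fact_id = conn.execute(
        "INSERT INTO facts (fact_text) VALUES (?)", (text,)
    ).lastrowid
    for name in tags:
        conn.execute("INSERT OR IGNORE INTO tags (tag_name) VALUES (?)", (name,))
        tag_id = conn.execute(
            "SELECT tag_id FROM tags WHERE tag_name = ?", (name,)
        ).fetchone()[0]
        conn.execute(
            "INSERT INTO tags_in_facts (fact_id, tag_id) VALUES (?, ?)",
            (fact_id, tag_id),
        )
    return fact_id


def facts_by_tag(conn, tag_name):
    """Return texts of non-deleted facts linked to the tag, oldest first."""
    rows = conn.execute(
        """
        SELECT f.fact_text
        FROM facts f
        JOIN tags_in_facts tf ON tf.fact_id = f.fact_id
        JOIN tags t ON t.tag_id = tf.tag_id
        WHERE t.tag_name = ?
          AND f.is_deleted = 0 AND tf.is_deleted = 0 AND t.is_deleted = 0
        ORDER BY f.fact_id
        """,
        (tag_name,),
    ).fetchall()
    return [row[0] for row in rows]
```

Filtering on `is_deleted` in all three tables keeps soft-deleted facts, tags and links out of the result without removing any rows.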

Use PostgreSQL for the DB; access details here:

Deploy on the server AWS-10

 

Dummy page: https://fanapps.asya.ai

screen -rd fanapps

On port 8847

PostgreSQL examples: http://share.yellowrobot.xyz/quick/2024-1-11-16AFD7E2-E83C-4A8C-8D5A-9178D4E586DB.html

Use the SQLAlchemy ORM for DB objects (raw SQL is also allowed in more complex cases, but in most cases the built-in functions are enough); examples here: http://share.yellowrobot.xyz/quick/2024-1-11-4480FBDF-544B-4B7E-AE0E-AB581FFEEEB6.html

⚠️⚠️⚠️ Use the Eldigen code as a reference (it already has a PDF scraper and several useful classes); try not to copy unnecessary code, and adapt it as much as possible: https://github.com/asya-ai/eldigen-web

 

Question generation execution flow

Example of how questions are generated:

Sequence diagram (participants: User, API, Worker, DB), messages in order:

  1. add_sources_file

  2. store source

  3. process source

  4. store facts

  5. list_sources

  6. generate_questions

  7. store batch request

  8. process batch request

  9. find similar facts by embeddings

  10. generate questions using LLM

  11. store questions

  12. list_question_batches

Loops shown in the diagram: wait until sources are processed; wait until questions are generated.
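The "find similar facts by embeddings" step can be illustrated with a small pure-Python sketch. In the real system this would presumably be a pgvector query against `fact_embedding`; the choice of cosine similarity, the function names and the `(fact_text, embedding)` input format are assumptions for illustration only.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def find_similar_facts(query_embedding, facts, top_k=3):
    """Return the top_k fact texts, most similar first.

    facts is a list of (fact_text, embedding) pairs.
    """
    scored = [
        (cosine_similarity(query_embedding, embedding), text)
        for text, embedding in facts
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:top_k]]
```

For example, with toy two-dimensional embeddings, `find_similar_facts([1.0, 0.0], [("A", [1.0, 0.0]), ("B", [0.0, 1.0]), ("C", [0.9, 0.1])], top_k=2)` returns `["A", "C"]`. The selected facts would then be passed to the LLM in the "generate questions using LLM" step.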

 

Related documents

API for data collection: https://kinexon-sports.com/products/engage/

Micro bets during a game: https://profluence.com/nvenue-receives-investment-from-nba/?utm_source=substack&utm_medium=email