Natural language (NL) is a promising alternative interface to database management systems (DBMSs) because it enables non-technical users to formulate complex questions. Recently, deep learning has gained traction for translating natural language to SQL. However, the core problem with existing deep learning approaches is that they require an enormous amount of manually curated training data in order to provide accurate translations. We present DBPal, which uses a novel training pipeline to learn NL2SQL interfaces; the pipeline synthesizes its training data and thus does not rely on manually curated training data.
We present our vision of OmniscientDB, a novel database that leverages the implicitly stored knowledge in large language models to augment data sets for analytical queries or machine learning tasks. OmniscientDB empowers users to augment data sets by means of simple SQL queries and thus has the potential to dramatically reduce the manual overhead associated with data integration. It uses automatic prompt engineering to construct appropriate prompts for given SQL queries and passes them to a large language model like GPT-3 to contribute additional data, augmenting the explicitly stored data. Our initial evaluation demonstrates the general feasibility of our vision, explores different prompting techniques in greater detail, and points towards future research.
Recently, a new class of systems for shared and collaborative data management has gained increasing traction. In contrast to classical database management systems (DBMSs), systems for shared data need to provide additional guarantees to ensure the integrity of data and transaction execution. In this paper, we present TrustDBle, a new DBMS that extends the ACID properties (i.e., atomicity, consistency, isolation, durability) used by classical DBMSs with a new verifiability component to address these new requirements.