RPE Database Example Queries

Experimental data analysis (Oct / Nov 2023)

Authors
Affiliations

Cathrynne Henshall

Michael J. Booth

Published

Tue Nov 12, 2024 11:38 AM

Abstract

Querying the data from the database

Purpose of This Notebook

This notebook serves as an exploratory tool for examining the log file data from the horse behavioural experiments conducted in October and November 2023 that are loaded into a local DuckDB database using logfile-to-database-RPE.ipynb.

It facilitates loading and querying the data from the database using some example SQL queries.

Setup project & directories

This is the init_notebook_mode cell from ITables v2.1.4
(you should not see this message - is your notebook trusted?)
INFO     | Data directory: /Users/mjboothaus/code/github/databooth/horse-logic/data
INFO     | Data directory purpose: Parent directory for raw and processed data
INFO     | Sql directory: /Users/mjboothaus/code/github/databooth/horse-logic/sql
INFO     | Sql directory purpose: Store SQL scripts
INFO     | Output directory: /Users/mjboothaus/code/github/databooth/horse-logic/notebooks/results/RPE
INFO     | Output directory purpose: Store output files and results by experiment type
INFO     | Logfiles directory: /Users/mjboothaus/code/github/databooth/horse-logic/data/results/zips/data_17Jan2020_email_hillydale_equine
INFO     | Logfiles directory purpose: Store for the raw log files
INFO     | Notebooks directory: /Users/mjboothaus/code/github/databooth/horse-logic/notebooks
INFO     | Notebooks directory purpose: Jupyter notebooks for performing analysis
INFO     | Database file path: /Users/mjboothaus/code/github/databooth/horse-logic/data/Experiments_RPE_2023_Q4.ddb
INFO     | Database purpose: Main project databases (outputs) by experiment type
INFO     | Project initialised (RPE): config defined in project_config.yaml
INFO     | Connected to database: /Users/mjboothaus/code/github/databooth/horse-logic/data/Experiments_RPE_2023_Q4.ddb
INFO     | DuckDB version: v1.1.2

Perform some example database queries (local DuckDB)

Show the tables in the database - should be Events, Experiments and Trials.

# Cross-check queries

con.sql("SHOW TABLES;")
┌─────────────┐
│    name     │
│   varchar   │
├─────────────┤
│ Events      │
│ Experiments │
│ Trials      │
└─────────────┘

Show the distinct EventTypes in alphabetical order

con.sql("SELECT DISTINCT EventType FROM Events ORDER BY EventType")
┌──────────────────────────────────────────────────────┐
│                      EventType                       │
│                       varchar                        │
├──────────────────────────────────────────────────────┤
│ Blue button pressed                                  │
│ Criterion achieved                                   │
│ Criterion count 1                                    │
│ Criterion count 2                                    │
│ Criterion count 3                                    │
│ Criterion count 4                                    │
│ Criterion count 5                                    │
│ Criterion count 6                                    │
│ Criterion count 7                                    │
│ Criterion count 8                                    │
│         ·                                            │
│         ·                                            │
│         ·                                            │
│ Habit formation incorrect response                   │
│ Keyboard interrupt - exiting...                      │
│ Manual touch recorded - Session under manual control │
│ Process paused                                       │
│ Process resumed                                      │
│ Session ended                                        │
│ Session terminated - trial limit reached             │
│ Start tone played                                    │
│ Touch-pad status read                                │
│ Touch-pad status read during sleep                   │
├──────────────────────────────────────────────────────┤
│                  46 rows (20 shown)                  │
└──────────────────────────────────────────────────────┘

List all of the Experiments

experiments_df = con.sql("SELECT * FROM Experiments").df()
experiments_df
ExperimentID Cohort SubjectName SubjectNumber SessionNumber ExperimentType Comment DateTime LogFileName Parameters
Loading ITables v2.1.4 from the init_notebook_mode cell... (need help?)

Get the information for Experiment with ID = 128 (the arbitrary id assigned when loaded)

con.sql(
    """
    SELECT
        ExperimentID, 
        Cohort,
        SubjectName, 
        SubjectNumber, 
        SessionNumber,
        ExperimentType, 
        Comment, 
        DateTime, 
        LogFileName
    FROM Experiments 
    WHERE ExperimentID = 128;
"""
).df()
ExperimentID Cohort SubjectName SubjectNumber SessionNumber ExperimentType Comment DateTime LogFileName
Loading ITables v2.1.4 from the init_notebook_mode cell... (need help?)

Reconstruct the event information for a specific experiment

con.sql(
    """
    SELECT 
        -- Trials.TrialID,
        Trials.TrialNumber,
        -- TrialStartTime,
        -- TrialEndTime,
        -- EventID, 
        EventNumber, 
        EventTime, 
        EventType,
        EventElapsedTime
    FROM Experiments
    INNER JOIN Trials ON Experiments.ExperimentID = Trials.ExperimentID
    INNER JOIN Events ON Trials.TrialID = Events.TrialID
    WHERE Experiments.ExperimentID = 128;
"""
).df()
TrialNumber EventNumber EventTime EventType EventElapsedTime
Loading ITables v2.1.4 from the init_notebook_mode cell... (need help?)

Reconstruct the event information for a specific subject (gio) and session number (2)

con.sql(
    """
    SELECT 
        -- Experiments.SubjectName,
        -- Experiments.SessionNumber,
        -- Trials.TrialID,
        Trials.TrialNumber,
        -- TrialStartTime,
        -- TrialEndTime,
        -- EventID, 
        EventNumber, 
        EventTime, 
        EventType,
        EventElapsedTime
    FROM Experiments
    INNER JOIN Trials ON Experiments.ExperimentID = Trials.ExperimentID
    INNER JOIN Events ON Trials.TrialID = Events.TrialID
    WHERE Experiments.SubjectName = 'gio' AND  Experiments.SessionNumber = 2;
"""
).df()
TrialNumber EventNumber EventTime EventType EventElapsedTime
Loading ITables v2.1.4 from the init_notebook_mode cell... (need help?)

Close the connection to the database

INFO     | Database connection closed