INFO | Data directory: /Users/mjboothaus/code/github/databooth/horse-logic/data
INFO | Data directory purpose: Parent directory for raw and processed data
INFO | Sql directory: /Users/mjboothaus/code/github/databooth/horse-logic/sql
INFO | Sql directory purpose: Store SQL scripts
INFO | Output directory: /Users/mjboothaus/code/github/databooth/horse-logic/notebooks/results/RPE
INFO | Output directory purpose: Store output files and results by experiment type
INFO | Logfiles directory: /Users/mjboothaus/code/github/databooth/horse-logic/data/results/zips/data_17Jan2020_email_hillydale_equine
INFO | Logfiles directory purpose: Store for the raw log files
INFO | Notebooks directory: /Users/mjboothaus/code/github/databooth/horse-logic/notebooks
INFO | Notebooks directory purpose: Jupyter notebooks for performing analysis
INFO | Existing database file deleted: /Users/mjboothaus/code/github/databooth/horse-logic/data/Experiments_RPE_2023_Q4.ddb
INFO | Database file path: /Users/mjboothaus/code/github/databooth/horse-logic/data/Experiments_RPE_2023_Q4.ddb
INFO | Database purpose: Main project databases (outputs) by experiment type
INFO | Project initialised (RPE): config defined in project_config.yaml
Purpose
This notebook is an exploratory tool for examining the log files produced during the horse behavioural experiments conducted in October and November 2023. It is used to experiment with text-parsing techniques on the files before they are imported into a DuckDB database.
The primary objectives are:
- To reconcile which log files should be included in or excluded from the analysis.
- To conduct experiments with regular expressions (regex) aimed at extracting pertinent data and fields from the log files.
Experiment details and naming conventions
There are two main types of Experiment:
- Reward Prediction (RPE)
- Cognitive Bias (CB)
The RPE experiments have the following subtypes:
- RPE-A: acquisition of response
- RPE-H: habit formation
- RPE-E: extinction of response
- RPE-ER: extinction prior to reinstatement of response
- RPE-R: reinstatement of response
RPE-type experiments
A new experiment type (RPE-ER) was created during the experiments; it was not in the original specification.
Logfile Exclusion rules
Log files from test runs, and log files containing bad data, need to be excluded from the analysis.
Rules are case-insensitive. Files that satisfy any of the following conditions are excluded:
- All files with _TEST_ as the subject name
- All files with _FRECKLE_ as the subject name
- Some files with _BONNIE_ as the subject name (14 legitimate files - TODO: CH to confirm details)
- All files with test in the Comment field
- Possibly some logs with very short run-times - TODO: CH to confirm the files that have been identified
- All files with _OLIVE_ as the subject name - probably discard, treat as optional for now (TODO: CH to confirm treatment)
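The rule-based part of these exclusions can be sketched in a few lines of Python. This is only an illustration, not the project's actual implementation: the function name exclusion_reason, the EXCLUDED_SUBJECTS set and the argument names are hypothetical, and it assumes the subject name and Comment field have already been extracted from each log file. The Bonnie and short run-time rules need manual confirmation, so they are deliberately left out.

```python
# Hypothetical sketch of the case-insensitive, rule-based exclusions.
# Olive is treated as excluded here, pending confirmation of its treatment.
EXCLUDED_SUBJECTS = {"test", "freckle", "olive"}


def exclusion_reason(subject_name, comment=""):
    """Return a short reason if the log file should be excluded, else None."""
    subject = subject_name.strip().lower()
    if subject in EXCLUDED_SUBJECTS:
        return f"subject '{subject}' is excluded"
    # Any mention of "test" in the Comment field marks a test run.
    if "test" in comment.lower():
        return "comment mentions 'test'"
    # Bonnie files and very short runs are not decided here (manual review).
    return None


print(exclusion_reason("FRECKLE"))
print(exclusion_reason("atom", comment="Test run only"))
print(exclusion_reason("atom", comment="normal session"))
```

Returning a reason string rather than a boolean makes it easy to record why each file was dropped when the exclude list is exported.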
Problems with log file names during experiments
- It appears that for more than 20 trials (subjects?) the value in the filename was reported as NaN, e.g. see the pumba experiments.
- Other NaNs? There are 31 files with NaN (some are restarts) - TODO: Confirm with CH why there were restarts. Why not a “new” experiment?
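A regular expression can split a log file name into its parts while tolerating a NaN value. The sketch below is a guess at the naming pattern, inferred from file names that appear in this notebook's output such as Experiment_2023-10-11T16:56:09.875604_bonnie_58_RPE-A.log. The names FILENAME_RE and parse_logfile_name are hypothetical, and it assumes the NaN appears in the numeric slot between the subject name and the experiment type.

```python
import re
from datetime import datetime

# Assumed layout: Experiment_<ISO datetime>_<subject>_<number or NaN>_<type>.log
FILENAME_RE = re.compile(
    r"^Experiment_"
    r"(?P<datetime>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(?:\.\d{6})?)_"
    r"(?P<subject>[A-Za-z]+)_"
    r"(?P<session>\d+|NaN)_"
    r"(?P<experiment_type>[A-Z]+(?:-[A-Z]+)?)\.log$",
    re.IGNORECASE,
)


def parse_logfile_name(filename):
    """Return the parsed filename fields, or None if the name does not match."""
    match = FILENAME_RE.match(filename)
    if match is None:
        return None
    session = match["session"]
    return {
        "datetime": datetime.fromisoformat(match["datetime"]),
        "subject_name": match["subject"].lower(),
        # A NaN in the filename is kept as a missing value, not guessed.
        "session_number": None if session.lower() == "nan" else int(session),
        "experiment_type": match["experiment_type"].upper(),
    }


info = parse_logfile_name(
    "Experiment_2023-10-11T16:56:09.875604_bonnie_58_RPE-A.log"
)
print(info)
```

File names that return None, or that parse with a missing session_number, are the ones worth listing for follow-up.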
Time differences
For each trial we calculate the following time differences:
1. Time delta: (touch datetime - start tone datetime)
2. Time delta: (Next start datetime - dispense of pellets datetime)
3. Time delta: (Dispense final pellets datetime - start tone datetime)
We use item 3 as a cross-check on the consistency of the previous two time deltas.
These calculated quantities are the same for all RPE-type experiments.
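As a rough illustration, the three time deltas can be computed with plain datetime arithmetic. The function and argument names below are hypothetical, the timestamps are made up, and the consistency check (the final dispense should not precede the touch) is only one plausible reading of the cross-check, not the notebook's confirmed logic.

```python
from datetime import datetime, timedelta


def trial_time_deltas(start_tone, touch, dispense, final_dispense, next_start):
    """Return the three per-trial time deltas as timedelta objects."""
    return {
        "touch_minus_start_tone": touch - start_tone,
        "next_start_minus_dispense": next_start - dispense,
        "final_dispense_minus_start_tone": final_dispense - start_tone,
    }


def is_consistent(deltas):
    # Assumed check: pellets cannot finish dispensing before the touch occurs.
    return (
        deltas["final_dispense_minus_start_tone"]
        >= deltas["touch_minus_start_tone"]
    )


# Illustrative timestamps only; these are not taken from a real log file.
t0 = datetime(2023, 10, 11, 16, 56, 10)
deltas = trial_time_deltas(
    start_tone=t0,
    touch=t0 + timedelta(seconds=4),
    dispense=t0 + timedelta(seconds=5),
    final_dispense=t0 + timedelta(seconds=6),
    next_start=t0 + timedelta(seconds=20),
)
print(deltas, is_consistent(deltas))
```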
Setup project & directories
# project.display_file_with_highlighting("project_config.yaml")
Get subject info
subject = Subject(project)
subject_df = subject.get_subject_info()
INFO | Loaded subject info from: /Users/mjboothaus/code/github/databooth/horse-logic/docs/from_CH/Cohort data for MB.xlsx
INFO | Subject count: 22
INFO | Sorted subject names:
apollo, ash, atom, bonnie, clover, dodge, dougie, dusty, filly, freya, george, gio, jelly, molly, mowgli, myrtle,
nix, olive, pumba, smudge, teddy, yoshi
Log file reconciliation
Initial exclusions (rule-based)
INFO | Logs: 267 log files in /Users/mjboothaus/code/github/databooth/horse-logic/data/results/zips/data_17Jan2020_email_hillydale_equine
INFO | Included logfiles: 267
INFO | Excluded logfiles: 114
INFO | List of files to be excluded in analysis (initially). 114 rows exported to CSV.
Included log files
Initial list of log files that are included based on rules specified above.
INFO | List of files to be included in analysis. 267 rows exported to CSV.
Sessions summary by subject name
Per-subject session tables (columns: original_filename | datetime | session_number | experiment_type | time_dff) are interactive and are not rendered here.
1. Atom: 9 session(s)
2. Ash: 14 session(s)
3. Mowgli: 10 session(s)
4. Teddy: 12 session(s)
5. Dodge: 10 session(s)
6. Filly: 10 session(s)
7. Dougie: 9 session(s)
8. Bonnie: 74 session(s)
9. Apollo: 8 session(s)
10. Molly: 8 session(s)
11. Jelly: 12 session(s)
12. Smudge: 10 session(s)
13. George: 9 session(s)
14. Myrtle: 9 session(s)
15. Yoshi: 8 session(s)
16. Nix: 7 session(s)
17. Gio: 8 session(s)
18. Dusty: 9 session(s)
19. Freya: 8 session(s)
20. Olive: 3 session(s)
21. Pumba: 11 session(s)
22. Clover: 9 session(s)
Session Summary - by Subject name and Session count
Subject number | Subject name | Session count |
---|---|---|
INFO | Session overview (subject session counts). 22 rows exported to CSV.
original_filename | subject_name | experiment_type | session_number | datetime | time_diff |
---|---|---|---|---|---|
Specific analysis for Bonnie log files
Some Bonnie log files were used for testing, so not all of the Bonnie experiments are valid to include.
original_filename | subject_name | experiment_type | session_number | datetime | time_diff | |
---|---|---|---|---|---|---|
INFO | Bonnie log file exclude list. 36 rows exported to CSV.
INFO | File Experiment_2023-10-11T16:56:09.875604_bonnie_58_RPE-A.log has been included.
INFO | File Experiment_2023-10-16T16:55:54.839279_bonnie_68_RPE-A.log has been included.
INFO | File Experiment_2023-10-19T14:37:56.973616_bonnie_72_RPE-H.log has been included.
INFO | # Files specifically included: 3
Write out final lists of excluded and included log files
INFO | List of all the log files excluded. 150 rows exported to CSV.
INFO | Summary of ALL the experiments (both included and excluded). 267 rows exported to CSV.
INFO | File list of all included log files. 234 rows exported to CSV.
Key output file here is all_included_files.csv.
It defines all of the log files that will be included in the load of data to the DuckDB database for further analysis in the next notebook.
The next notebook to run is notebooks/logfile-to-database-RPE.ipynb.