INFO | Data directory: /Users/mjboothaus/code/github/databooth/horse-logic/data
INFO | Data directory purpose: Parent directory for raw and processed data
INFO | Sql directory: /Users/mjboothaus/code/github/databooth/horse-logic/sql
INFO | Sql directory purpose: Store SQL scripts
INFO | Output directory: /Users/mjboothaus/code/github/databooth/horse-logic/notebooks/results/CB
INFO | Output directory purpose: Store output files and results by experiment type
INFO | Logfiles directory: /Users/mjboothaus/code/github/databooth/horse-logic/data/results/zips/cb_data
INFO | Logfiles directory purpose: Store for the raw log files
INFO | Notebooks directory: /Users/mjboothaus/code/github/databooth/horse-logic/notebooks
INFO | Notebooks directory purpose: Jupyter notebooks for performing analysis
INFO | Existing database file deleted: /Users/mjboothaus/code/github/databooth/horse-logic/data/Experiments_CB_2023_Q4.ddb
INFO | Database file path: /Users/mjboothaus/code/github/databooth/horse-logic/data/Experiments_CB_2023_Q4.ddb
INFO | Database purpose: Main project databases (outputs) by experiment type
INFO | Project initialised (CB): config defined in project_config.yaml
Purpose of This Notebook
This notebook explores the log files produced during the Cognitive Bias (CB) horse behavioural experiments conducted in October and November 2023. It is used to try out text-parsing techniques on the files before they are imported into a database. The primary objective is to reconcile which log files should be included in, or excluded from, the analysis.
Experiment details and naming conventions
Logfile Exclusion rules
Log files from test runs, and log files containing bad data, need to be excluded from the analysis.
Rules are case-insensitive. Files which satisfy the following conditions are excluded:
- TODO
Problems with log file names during experiments
- TODO
Time differences
For each trial we calculate the following time differences:
Cognitive Bias Experiments
- Extract “RIGHT” or “LEFT” from the Comment field.
- Also extract details from log file name
- Training experiments:
  - Type 1
  - Type 2
- Testing experiments:
  - Type 1
  - Type 2
  - Type 3
  - Type 4 (re-uses Type 1 with indicator to distinguish in Comment field)
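The first bullet above (extracting "RIGHT" or "LEFT" from the Comment field) could be sketched as follows. This is a minimal illustration, not the project's parser: the function name `extract_side`, the regular expression, and the choice to return `None` when a comment mentions both sides or neither are all assumptions.

```python
import re
from typing import Optional

# Hypothetical pattern: matches LEFT or RIGHT as whole words, ignoring case.
SIDE_PATTERN = re.compile(r"\b(LEFT|RIGHT)\b", re.IGNORECASE)


def extract_side(comment: Optional[str]) -> Optional[str]:
    """Return "LEFT" or "RIGHT" if the comment names exactly one side, else None."""
    # Collect the distinct sides mentioned, normalised to upper case.
    sides = {match.upper() for match in SIDE_PATTERN.findall(comment or "")}
    # Ambiguous (both sides) or empty comments yield None (assumed behaviour).
    return sides.pop() if len(sides) == 1 else None
```

Whole-word matching avoids false positives from words that merely contain "right" or "left", such as "upright".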
Time differences
For each trial:
Training Type 1 (randomised versus fixed):
- Start datetime = Green button pressed and horse is released
- Capture positive (GO) / negative (NOGO) response time subject to maximum cutoff time (e.g. 30 seconds)
- In addition to left/right positioning of feed, there are also median, near positive and near negative positions.
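A response time subject to a cutoff could be computed along these lines. This is only a sketch under stated assumptions: the name `response_time_seconds` is hypothetical, the 30-second default is the example value from the text, and treating a missing response as reaching the cutoff is a guess that should be confirmed against the experimental protocol.

```python
from datetime import datetime
from typing import Optional

# Example cutoff taken from the text ("e.g. 30 seconds"); the real value may differ.
MAX_RESPONSE_SECONDS = 30.0


def response_time_seconds(
    start: datetime,
    response: Optional[datetime],
    cutoff: float = MAX_RESPONSE_SECONDS,
) -> float:
    """Seconds from trial start (green button pressed) to the response, capped at cutoff."""
    if response is None:
        # Assumption: no recorded response is scored as the cutoff time.
        return cutoff
    elapsed = (response - start).total_seconds()
    if elapsed < 0:
        raise ValueError("response timestamp precedes start timestamp")
    return min(elapsed, cutoff)
```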
TODO check with CH: Test only - be in all?
Log file reconciliation
Setup project & directories
Get Subject info
INFO | Loaded subject info from: /Users/mjboothaus/code/github/databooth/horse-logic/docs/from_CH/Cohort data for MB.xlsx
INFO | Subject count: 22
INFO | Sorted subject names:
apollo, ash, atom, bonnie, clover, dodge, dougie, dusty, filly, freya, george, gio, jelly, molly, mowgli, myrtle,
nix, olive, pumba, smudge, teddy, yoshi
Initial exclusions (rule-based)
Rules:
Ignore all CBF1 files - data will not be analysed (CH: What does CBF1 mean? and similar)
Ignore all Olive files (6 log files)
Ignore Maple CBT1 on 9 Oct (CH: Did Maple have a different name?)
Run check to see how many N bucket GO responses exceed 30s in first 3 days (CH: Please explain)
29 *test*.log files (exclude)
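Rule-based, case-insensitive exclusion of this kind can be sketched with glob patterns. The patterns and file names below are hypothetical stand-ins loosely based on the rules above, and the helpers `is_excluded` and `split_logfiles` are illustrative names rather than part of the project's `Logs` class.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List, Sequence, Tuple

# Hypothetical glob patterns; the real naming conventions are not documented here.
EXCLUDE_PATTERNS = ["*test*.log", "*olive*", "*cbf1*"]


def is_excluded(filename: str, patterns: Sequence[str] = EXCLUDE_PATTERNS) -> bool:
    """True if the file name matches any exclusion pattern, ignoring case."""
    name = filename.lower()
    # Lower-casing both sides makes the match case-insensitive on every platform.
    return any(fnmatchcase(name, pattern.lower()) for pattern in patterns)


def split_logfiles(
    filenames: Iterable[str], patterns: Sequence[str] = EXCLUDE_PATTERNS
) -> Tuple[List[str], List[str]]:
    """Partition file names into (included, excluded), preserving input order."""
    included, excluded = [], []
    for filename in filenames:
        (excluded if is_excluded(filename, patterns) else included).append(filename)
    return included, excluded
```

Keeping the excluded list alongside the included one makes it straightforward to export both for review.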
### Override - files to be excluded specified in following CSV file 16 October 2024 by email from CH
exclude_16Oct2024_csv = (
    project.project_dir / "docs" / "from_CH" / "CB_all_included_files_CH_16Oct2024.csv"
)
exclude_16Oct2024_df = pd.read_csv(
    exclude_16Oct2024_csv, header=None, names=["Exclude", "csv_index", "logfilename"]
)
exclude_16Oct2024_df.head()
Exclude | csv_index | logfilename |
---|---|---|
# Create LOGFILES_TO_INCLUDE_16Oct2024
LOGFILES_TO_INCLUDE_16Oct2024 = exclude_16Oct2024_df[exclude_16Oct2024_df["Exclude"].isna()][
    "logfilename"
].tolist()
# Create LOGFILES_TO_EXCLUDE_16Oct2024
LOGFILES_TO_EXCLUDE_16Oct2024 = exclude_16Oct2024_df[~exclude_16Oct2024_df["Exclude"].isna()][
    "logfilename"
].tolist()
len(LOGFILES_TO_INCLUDE_16Oct2024) + len(LOGFILES_TO_EXCLUDE_16Oct2024)
244
len(LOGFILES_TO_INCLUDE_16Oct2024)
183
len(list(set(LOGFILES_TO_INCLUDE_16Oct2024)))
183
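The three counts above (244 listed, 183 included, 183 unique) amount to a check that the include list contains no duplicates. A slightly broader consistency check might look like the sketch below. The function `reconcile_lists` and its return keys are hypothetical, and the overlap test is an extra check that the notebook itself does not perform.

```python
from collections import Counter
from typing import Dict, List, Sequence, Union


def reconcile_lists(
    included: Sequence[str], excluded: Sequence[str]
) -> Dict[str, Union[int, List[str]]]:
    """Summarise duplicates and overlaps between include and exclude lists."""
    # File names listed more than once in the include list.
    duplicates = sorted(name for name, n in Counter(included).items() if n > 1)
    # File names that appear in both lists, which would be contradictory.
    overlap = sorted(set(included) & set(excluded))
    return {
        "total": len(included) + len(excluded),
        "duplicates": duplicates,
        "overlap": overlap,
    }
```

An empty `duplicates` list corresponds to the notebook's observation that the included count equals the unique included count.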
Identify test log files (to exclude)
INFO | Testing log files list. 29 rows exported to CSV.
Identify olive log files to exclude
INFO | Olive log files list. 6 rows exported to CSV.
Identify filly log files to exclude
WARNING: There are no filly log files identified.
INFO | Filly log files list. 12 rows exported to CSV.
#### Override the previous list
logs = Logs(path=LOGFILES_DIR, patterns=LOGFILES_TO_INCLUDE_16Oct2024, include=True)
INFO | Logs: 183 log files in /Users/mjboothaus/code/github/databooth/horse-logic/data/results/zips/cb_data
INFO | Included logfiles: 183
INFO | Excluded logfiles: 96
Excluded |
---|
Included log files
INFO | List of files to be included in analysis. 183 rows exported to CSV.
Sessions summary by subject name
1. Atom: 9 session(s)
original_filename | datetime | session_number | experiment_type | time_dff | |
---|---|---|---|---|---|
2. Ash: 8 session(s)
original_filename | datetime | session_number | experiment_type | time_dff | |
---|---|---|---|---|---|
3. Mowgli: 8 session(s)
original_filename | datetime | session_number | experiment_type | time_dff | |
---|---|---|---|---|---|
4. Teddy: 8 session(s)
original_filename | datetime | session_number | experiment_type | time_dff | |
---|---|---|---|---|---|
5. Dodge: 8 session(s)
original_filename | datetime | session_number | experiment_type | time_dff | |
---|---|---|---|---|---|
6. Filly: 8 session(s)
original_filename | datetime | session_number | experiment_type | time_dff | |
---|---|---|---|---|---|
7. Dougie: 8 session(s)
original_filename | datetime | session_number | experiment_type | time_dff | |
---|---|---|---|---|---|
8. Bonnie: 9 session(s)
original_filename | datetime | session_number | experiment_type | time_dff | |
---|---|---|---|---|---|
9. Apollo: 10 session(s)
original_filename | datetime | session_number | experiment_type | time_dff | |
---|---|---|---|---|---|
10. Molly: 9 session(s)
original_filename | datetime | session_number | experiment_type | time_dff | |
---|---|---|---|---|---|
11. Jelly: 8 session(s)
original_filename | datetime | session_number | experiment_type | time_dff | |
---|---|---|---|---|---|
12. Smudge: 10 session(s)
original_filename | datetime | session_number | experiment_type | time_dff | |
---|---|---|---|---|---|
13. George: 7 session(s)
original_filename | datetime | session_number | experiment_type | time_dff | |
---|---|---|---|---|---|
14. Myrtle: 10 session(s)
original_filename | datetime | session_number | experiment_type | time_dff | |
---|---|---|---|---|---|
15. Yoshi: 10 session(s)
original_filename | datetime | session_number | experiment_type | time_dff | |
---|---|---|---|---|---|
16. Nix: 10 session(s)
original_filename | datetime | session_number | experiment_type | time_dff | |
---|---|---|---|---|---|
17. Gio: 8 session(s)
original_filename | datetime | session_number | experiment_type | time_dff | |
---|---|---|---|---|---|
18. Dusty: 10 session(s)
original_filename | datetime | session_number | experiment_type | time_dff | |
---|---|---|---|---|---|
19. Freya: 7 session(s)
original_filename | datetime | session_number | experiment_type | time_dff | |
---|---|---|---|---|---|
INFO | Subject olive: No experiments conducted
21. Pumba: 8 session(s)
original_filename | datetime | session_number | experiment_type | time_dff | |
---|---|---|---|---|---|
22. Clover: 10 session(s)
original_filename | datetime | session_number | experiment_type | time_dff | |
---|---|---|---|---|---|
Session Summary
Subject number | Subject name | Session count |
---|---|---|
INFO | Session overview (subject session counts). 22 rows exported to CSV.
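A per-subject session count like the overview above, including subjects with no sessions (as reported for olive), could be produced roughly as follows. This is a sketch only: `session_counts` is a hypothetical helper, and it assumes the subject name for each included log file has already been determined by some other step.

```python
from collections import Counter
from typing import Dict, Iterable, Sequence


def session_counts(
    subjects: Sequence[str], session_subjects: Iterable[str]
) -> Dict[str, int]:
    """Count sessions per subject, reporting 0 for subjects with no sessions."""
    # Compare names case-insensitively, since log file naming may vary in case.
    counts = Counter(name.lower() for name in session_subjects)
    return {name: counts.get(name.lower(), 0) for name in subjects}
```

Starting from the full subject list, rather than from the sessions, is what makes subjects without any included log files visible in the summary.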
original_filename | subject_name | experiment_type | session_number | datetime | time_diff |
---|---|---|---|---|---|
INFO | List of log files excluded. 96 rows exported to CSV.
INFO | Experiment summary. 183 rows exported to CSV.
INFO | File list of log files included. 183 rows exported to CSV.