Logfile Reconciliation - Cognitive Bias (CB)

Authors

Affiliations

Cathrynne Henshall

Charles Sturt University

Michael J. Booth

Published

Tue Nov 12, 2024 11:38 AM

Abstract

Reconciliation of the CB logfiles generated during the experiments.

Purpose of This Notebook

This notebook is an exploratory tool for examining the log files produced during the Cognitive Bias (CB) horse behavioural experiments conducted in October and November 2023. It is used to try out text-parsing techniques on the files before they are imported into a database. The primary objective is to reconcile which log files should be included in or excluded from the analysis.

Experiment details and naming conventions

Logfile Exclusion rules

Log files from test runs, and log files containing bad data, must be excluded from the analysis.

Rules are case-insensitive. Files which satisfy the following conditions are excluded:

TODO

Problems with log file names during experiments

TODO

Time differences

For each trial we calculate the following time differences:

Cognitive Bias Experiments

Extract “RIGHT” or “LEFT” from the Comment field.
Also extract details from the log file name.
1. Training experiments:
- Type 1
- Type 2
2. Testing experiments:
- Type 1
- Type 2
- Type 3
- Type 4 (re-uses Type 1 with indicator to distinguish in Comment field)
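
As a minimal sketch of the RIGHT/LEFT extraction, the snippet below pulls the side out of a comment string with a case-insensitive regular expression. The function name `extract_side` and the assumption that the side appears as a standalone word in the Comment field are illustrative only; the actual comment format should be checked against the log files.

```python
import re
from typing import Optional

# Assumption: the side appears as a standalone word somewhere in the comment.
_SIDE_PATTERN = re.compile(r"\b(LEFT|RIGHT)\b", re.IGNORECASE)


def extract_side(comment: str) -> Optional[str]:
    """Return "LEFT" or "RIGHT" if exactly one side is mentioned, else None."""
    sides = {match.upper() for match in _SIDE_PATTERN.findall(comment)}
    # Ambiguous comments (both sides, or neither) are left for manual review.
    return sides.pop() if len(sides) == 1 else None
```

Returning `None` for ambiguous comments keeps questionable rows visible rather than silently assigning a side.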

Time differences

For each trial:

Training Type 1 (randomised versus fixed):
- Start datetime = Green button pressed and horse is released
- Capture positive (GO) / negative (NOGO) response time subject to maximum cutoff time (e.g. 30 seconds)
- In addition to left/right positioning of feed, there are also median, near positive and near negative positions.
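
The response-time calculation described above can be sketched as follows. This is an illustration under stated assumptions, not the project's implementation: the name `response_time` is hypothetical, the 30-second cutoff is the example value from the text, and treating a missing or late response as censored at the cutoff is an assumption that should be confirmed with CH.

```python
from datetime import datetime, timedelta
from typing import Optional, Tuple

MAX_RESPONSE = timedelta(seconds=30)  # example cutoff taken from the text


def response_time(
    start: datetime,
    response: Optional[datetime],
    cutoff: timedelta = MAX_RESPONSE,
) -> Tuple[timedelta, bool]:
    """Return (elapsed, timed_out) for one trial.

    Assumption: a missing response, or one later than the cutoff,
    is recorded as the cutoff itself and flagged as timed out.
    """
    if response is None or response - start > cutoff:
        return cutoff, True
    return response - start, False
```

Keeping the `timed_out` flag alongside the elapsed time lets later analysis distinguish a genuine 30-second response from a censored one.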

TODO check with CH: Test only - be in all?

Log file reconciliation

Setup project & directories


INFO     | Data directory: /Users/mjboothaus/code/github/databooth/horse-logic/data
INFO     | Data directory purpose: Parent directory for raw and processed data
INFO     | Sql directory: /Users/mjboothaus/code/github/databooth/horse-logic/sql
INFO     | Sql directory purpose: Store SQL scripts
INFO     | Output directory: /Users/mjboothaus/code/github/databooth/horse-logic/notebooks/results/CB
INFO     | Output directory purpose: Store output files and results by experiment type
INFO     | Logfiles directory: /Users/mjboothaus/code/github/databooth/horse-logic/data/results/zips/cb_data
INFO     | Logfiles directory purpose: Store for the raw log files
INFO     | Notebooks directory: /Users/mjboothaus/code/github/databooth/horse-logic/notebooks
INFO     | Notebooks directory purpose: Jupyter notebooks for performing analysis
INFO     | Existing database file deleted: /Users/mjboothaus/code/github/databooth/horse-logic/data/Experiments_CB_2023_Q4.ddb
INFO     | Database file path: /Users/mjboothaus/code/github/databooth/horse-logic/data/Experiments_CB_2023_Q4.ddb
INFO     | Database purpose: Main project databases (outputs) by experiment type
INFO     | Project initialised (CB): config defined in project_config.yaml

Get Subject info

INFO     | Loaded subject info from: /Users/mjboothaus/code/github/databooth/horse-logic/docs/from_CH/Cohort data for MB.xlsx
INFO     | Subject count: 22
INFO     | Sorted subject names:
    apollo, ash, atom, bonnie, clover, dodge, dougie, dusty, filly, freya, george, gio, jelly, molly, mowgli, myrtle, 
    nix, olive, pumba, smudge, teddy, yoshi

Initial exclusions (rule-based)

Rules:

- Ignore all CBF1 files - data will not be analysed (CH: What does CBF1 mean? and similar)
- Ignore all Olive files (6 log files)
- Ignore Maple CBT1 on 9 Oct (CH: Did Maple have a different name?)
- Run check to see how many N bucket GO responses exceed 30s in first 3 days (CH: Please explain)
- Exclude the 29 *test*.log files
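
A minimal sketch of how case-insensitive, pattern-based exclusion rules like these could be applied to file names is shown below. The patterns and the helper `is_excluded` are illustrative assumptions, and the file names in the usage example are made up; this is not the project's rule engine.

```python
from fnmatch import fnmatchcase

# Illustrative patterns based on the rules above (assumed, not the real config).
EXCLUDE_PATTERNS = ["*test*", "*olive*", "*cbf1*"]


def is_excluded(filename: str, patterns=EXCLUDE_PATTERNS) -> bool:
    """Return True if the file name matches any pattern, ignoring case."""
    name = filename.lower()
    # Lower-casing both sides makes the match case-insensitive on every OS.
    return any(fnmatchcase(name, pattern.lower()) for pattern in patterns)


# Usage with hypothetical file names:
# is_excluded("Olive_CBT1.log")  -> True
# is_excluded("atom_cbt2.log")   -> False
```

Substring patterns such as `*test*` can over-match (for example, a file for a Testing experiment whose name contains "test"), so any rule set of this kind needs checking against the full file list.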

### Override - files to be excluded specified in following CSV file 16 October 2024 by email from CH

exclude_16Oct2024_csv = (
    project.project_dir / "docs" / "from_CH" / "CB_all_included_files_CH_16Oct2024.csv"
)

exclude_16Oct2024_df = pd.read_csv(
    exclude_16Oct2024_csv, header=None, names=["Exclude", "csv_index", "logfilename"]
)

exclude_16Oct2024_df.head()

Exclude	csv_index	logfilename

# Rows with an empty "Exclude" column are the log files to include
LOGFILES_TO_INCLUDE_16Oct2024 = exclude_16Oct2024_df[exclude_16Oct2024_df["Exclude"].isna()][
    "logfilename"
].tolist()

# Rows with any value in the "Exclude" column are the log files to exclude
LOGFILES_TO_EXCLUDE_16Oct2024 = exclude_16Oct2024_df[exclude_16Oct2024_df["Exclude"].notna()][
    "logfilename"
].tolist()

len(LOGFILES_TO_INCLUDE_16Oct2024) + len(LOGFILES_TO_EXCLUDE_16Oct2024)

len(LOGFILES_TO_INCLUDE_16Oct2024)

len(list(set(LOGFILES_TO_INCLUDE_16Oct2024)))
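
The length comparison above shows whether the include list contains duplicates, but not which names are repeated. A small sketch using only the standard library is given below; the helper name `find_duplicates` is hypothetical.

```python
from collections import Counter


def find_duplicates(names):
    """Return {name: count} for every name that appears more than once."""
    counts = Counter(names)
    return {name: n for name, n in counts.items() if n > 1}


# Usage (an empty dict means the list has no duplicates):
# find_duplicates(LOGFILES_TO_INCLUDE_16Oct2024)
```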

Identify `test` log files (to exclude)

INFO     | Testing log files list. 29 rows exported to CSV.

results/CB/test_log_files.csv

Identify `olive` log files to exclude

INFO     | Olive log files list. 6 rows exported to CSV.

results/CB/olive_log_files.csv

Identify `filly` log files to exclude

WARNING: There are no filly log files identified.

INFO     | Filly log files list. 12 rows exported to CSV.

results/CB/filly_log_files.csv

#### Override the previous list

logs = Logs(path=LOGFILES_DIR, patterns=LOGFILES_TO_INCLUDE_16Oct2024, include=True)

INFO     | Logs: 183 log files in /Users/mjboothaus/code/github/databooth/horse-logic/data/results/zips/cb_data
INFO     | Included logfiles: 183
INFO     | Excluded logfiles: 96
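
For illustration, the include/exclude split reported above can be thought of as a partition of the files found on disk against the include list. The sketch below is an assumption about the general approach, not the implementation of the `Logs` class; the function name `partition_logfiles` is hypothetical.

```python
def partition_logfiles(found, include_list):
    """Split file names found on disk into (included, excluded, missing).

    `missing` holds names on the include list that were not found on disk,
    which is worth checking before trusting the counts.
    """
    include = set(include_list)
    included = sorted(name for name in found if name in include)
    excluded = sorted(name for name in found if name not in include)
    missing = sorted(include - set(found))
    return included, excluded, missing
```

A non-empty `missing` list would indicate a mismatch between the CSV supplied by CH and the file names in the logfiles directory.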

Excluded

Included log files

INFO     | List of files to be included in analysis. 183 rows exported to CSV.

results/CB/included_files.csv

Sessions summary by subject name

1. Atom: 9 session(s)

	original_filename	datetime	session_number	experiment_type	time_dff

2. Ash: 8 session(s)

	original_filename	datetime	session_number	experiment_type	time_dff

3. Mowgli: 8 session(s)

	original_filename	datetime	session_number	experiment_type	time_dff

4. Teddy: 8 session(s)

	original_filename	datetime	session_number	experiment_type	time_dff

5. Dodge: 8 session(s)

	original_filename	datetime	session_number	experiment_type	time_dff

6. Filly: 8 session(s)

	original_filename	datetime	session_number	experiment_type	time_dff

7. Dougie: 8 session(s)

	original_filename	datetime	session_number	experiment_type	time_dff

8. Bonnie: 9 session(s)

	original_filename	datetime	session_number	experiment_type	time_dff

9. Apollo: 10 session(s)

	original_filename	datetime	session_number	experiment_type	time_dff

10. Molly: 9 session(s)

	original_filename	datetime	session_number	experiment_type	time_dff

11. Jelly: 8 session(s)

	original_filename	datetime	session_number	experiment_type	time_dff

12. Smudge: 10 session(s)

	original_filename	datetime	session_number	experiment_type	time_dff

13. George: 7 session(s)

	original_filename	datetime	session_number	experiment_type	time_dff

14. Myrtle: 10 session(s)

	original_filename	datetime	session_number	experiment_type	time_dff

15. Yoshi: 10 session(s)

	original_filename	datetime	session_number	experiment_type	time_dff

16. Nix: 10 session(s)

	original_filename	datetime	session_number	experiment_type	time_dff

17. Gio: 8 session(s)

	original_filename	datetime	session_number	experiment_type	time_dff

18. Dusty: 10 session(s)

	original_filename	datetime	session_number	experiment_type	time_dff

19. Freya: 7 session(s)

	original_filename	datetime	session_number	experiment_type	time_dff

INFO     | Subject olive: No experiments conducted

21. Pumba: 8 session(s)

	original_filename	datetime	session_number	experiment_type	time_dff

22. Clover: 10 session(s)

	original_filename	datetime	session_number	experiment_type	time_dff

Session Summary

Subject number	Subject name	Session count

INFO     | Session overview (subject session counts). 22 rows exported to CSV.

results/CB/session_overview.csv

original_filename	subject_name	experiment_type	session_number	datetime	time_diff

INFO     | List of log files excluded. 96 rows exported to CSV.

results/CB/all_excluded_files.csv

INFO     | Experiment summary. 183 rows exported to CSV.

results/CB/experiment_summary_CB_2023_included.csv

INFO     | File list of log files included. 183 rows exported to CSV.

results/CB/all_included_files.csv