27 Agreement vs Research Reference Diagnosis (Core Level)

Note for Pathologists: Performance metrics (Sensitivity, Specificity, etc.) for each pathologist compared to the research gold standard, calculated at the core level.

Per-pathologist decision metrics vs research gold standard (core level).
Pathologist	Mode	TP	TN	FP	FN	Sensitivity	Specificity	PPV	NPV	Accuracy
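
The metric columns above follow the usual confusion-matrix definitions. The sketch below shows how they could be derived from the TP/TN/FP/FN counts. It is an illustration only: the function names are hypothetical, the example counts are made up, and it assumes "positive" means a positive research reference diagnosis at the core level.

```python
from typing import Dict, Optional


def safe_div(num: float, den: float) -> Optional[float]:
    """Return num / den, or None when the denominator is zero."""
    return num / den if den else None


def decision_metrics(tp: int, tn: int, fp: int, fn: int) -> Dict[str, Optional[float]]:
    """Compute the table's metric columns from confusion-matrix counts."""
    total = tp + tn + fp + fn
    return {
        "Sensitivity": safe_div(tp, tp + fn),   # true positives among reference positives
        "Specificity": safe_div(tn, tn + fp),   # true negatives among reference negatives
        "PPV": safe_div(tp, tp + fp),           # correct calls among positive calls
        "NPV": safe_div(tn, tn + fn),           # correct calls among negative calls
        "Accuracy": safe_div(tp + tn, total),   # all correct calls among all cores
    }


# Made-up counts for illustration only; these are NOT study results.
example = decision_metrics(tp=8, tn=80, fp=2, fn=10)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

Returning None for an empty denominator keeps a pathologist/mode row with no positive (or no negative) cores from raising an error; such cells would then be reported as NA.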

28 Mixed-Effects Model for Time Impact

Note for Pathologists: Mixed-effects statistical model analyzing the impact of AI on diagnosis time, accounting for variability between different pathologists and slides.

Mixed model (log-seconds) with random intercepts for pathologist and slide.
Term	Estimate	Std. Error	t value
(Intercept)	3.630	0.028	131.350
ModewithAI	0.004	0.011	0.374
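
Because the outcome is modeled in log-seconds, a coefficient is easier to read once exponentiated: exp(estimate) is a multiplicative change in diagnosis time. The sketch below back-transforms the two reported estimates. It assumes natural logarithms, assumes noAI is the reference level (as the term name ModewithAI suggests), and uses a normal-approximation (Wald) interval with z = 1.96, which is not part of the original output. The function name is hypothetical.

```python
import math


def back_transform(estimate: float, std_error: float, z: float = 1.96) -> dict:
    """Exponentiate a log-scale estimate and its approximate Wald interval."""
    low = estimate - z * std_error
    high = estimate + z * std_error
    return {
        "ratio": math.exp(estimate),
        "ci_low": math.exp(low),
        "ci_high": math.exp(high),
    }


# Values copied from the model table above.
mode_effect = back_transform(estimate=0.004, std_error=0.011)

# Typical time in the reference mode (assumed noAI), in seconds.
baseline_seconds = math.exp(3.630)

print(f"withAI / noAI time ratio: {mode_effect['ratio']:.3f} "
      f"({mode_effect['ci_low']:.3f} to {mode_effect['ci_high']:.3f})")
print(f"baseline time: {baseline_seconds:.1f} s")
```

Under these assumptions the withAI ratio is very close to 1 and its approximate interval spans 1, which is consistent with the small t value in the table.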

29 Subjective Benefit vs Accuracy and Time

Note for Pathologists: Analysis linking pathologists’ subjective feedback (was AI helpful? did they agree?) to objective changes in diagnosis time and accuracy.

Self-reported helpfulness/agreement vs time and accuracy deltas.
Pathologist	AIHelpful	AIAgree	median_delta_time	accuracy_gain_rate	accuracy_loss_rate	n
P1	AI was unnecessary	AI raised unnecessary suspicion	31.29	0	0	2
P1	AI was unnecessary	I agreed with the AI diagnosis	2.52	0	0	651
P1	AI was unnecessary	I disagreed with the AI diagnosis	24.91	0	0	109
P1	AI hindered my diagnosis	AI raised unnecessary suspicion	23.47	0	0	12
P1	AI hindered my diagnosis	I disagreed with the AI diagnosis	35.20	0	0	11
P1	AI helped with the diagnosis	I agreed with the AI diagnosis	-0.30	0	0	35
P1	AI helped with the diagnosis	I disagreed with the AI diagnosis	4.83	0	0	8
P1	NA	NA	1.47	0	0	23
P2	AI was unnecessary	AI missed the tumor	-7.36	0	0	1
P2	AI hindered my diagnosis	AI raised unnecessary suspicion	-0.66	0	0	10
P2	AI hindered my diagnosis	I disagreed with the AI diagnosis	8.58	0	0	15
P2	AI helped with the diagnosis	AI raised unnecessary suspicion	-34.37	0	0	2
P2	AI helped with the diagnosis	I agreed with the AI diagnosis	-2.84	0	0	798
P2	AI helped with the diagnosis	I disagreed with the AI diagnosis	NA	0	0	1
P2	NA	NA	0.87	0	0	24
P3	AI was unnecessary	AI raised unnecessary suspicion	61.59	0	0	1
P3	AI was unnecessary	I agreed with the AI diagnosis	-5.43	0	0	373
P3	AI hindered my diagnosis	AI raised unnecessary suspicion	47.27	0	0	17
P3	AI hindered my diagnosis	I disagreed with the AI diagnosis	5.55	0	0	12
P3	AI hindered my diagnosis	AI missed the tumor	85.49	0	0	1
P3	AI helped with the diagnosis	AI raised unnecessary suspicion	-4.28	0	0	1
P3	AI helped with the diagnosis	I agreed with the AI diagnosis	-5.72	0	0	387
P3	AI helped with the diagnosis	I disagreed with the AI diagnosis	7.30	0	0	27
P3	AI helped with the diagnosis	AI missed the tumor	17.28	0	0	2
P3	AI helped with the diagnosis	NA	61.98	0	0	3
P3	NA	AI missed the tumor	47.04	0	0	1
P3	NA	NA	-16.80	0	0	26
P4	AI was unnecessary	AI raised unnecessary suspicion	75.76	0	0	1
P4	AI was unnecessary	I agreed with the AI diagnosis	3.80	0	0	796
P4	AI was unnecessary	I disagreed with the AI diagnosis	35.35	0	0	9
P4	AI was unnecessary	AI missed the tumor	23.53	0	0	3
P4	AI hindered my diagnosis	AI raised unnecessary suspicion	21.07	0	0	2
P4	AI hindered my diagnosis	I agreed with the AI diagnosis	-9.17	0	0	4
P4	AI hindered my diagnosis	I disagreed with the AI diagnosis	46.15	0	0	3
P4	AI helped with the diagnosis	I agreed with the AI diagnosis	8.57	0	0	11
P4	NA	NA	NA	NaN	NaN	22

30 Consult / IHC Ordering Impact (Proxy)

Note for Pathologists: Comparison of how often consultations or IHC were ordered (or indicated in the diagnosis) with and without AI assistance.

Consult/IHC signal before vs after AI (proxy from diagnosis labels).
Pathologist	Mode	n	consult_or_ihc	rate
P1	noAI	851	143	0.168
P1	withAI	851	46	0.054
P2	noAI	851	39	0.046
P2	withAI	851	20	0.024
P3	noAI	851	44	0.052
P3	withAI	851	21	0.025
P4	noAI	851	83	0.098
P4	withAI	851	21	0.025
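
The rate column is consult_or_ihc divided by n. The sketch below recomputes it from the counts in the table and adds the absolute and relative change between modes. The function and variable names are hypothetical, and the two derived change columns are not part of the original report.

```python
from typing import Dict, Optional

# (pathologist, n, consult_or_ihc without AI, consult_or_ihc with AI), copied from the table.
rows = [
    ("P1", 851, 143, 46),
    ("P2", 851, 39, 20),
    ("P3", 851, 44, 21),
    ("P4", 851, 83, 21),
]


def rate_change(n: int, no_ai: int, with_ai: int) -> Dict[str, Optional[float]]:
    """Return per-mode rates plus absolute and relative change (withAI minus noAI)."""
    rate_no_ai = no_ai / n
    rate_with_ai = with_ai / n
    abs_diff = rate_with_ai - rate_no_ai
    rel_change = abs_diff / rate_no_ai if rate_no_ai else None
    return {
        "rate_noAI": rate_no_ai,
        "rate_withAI": rate_with_ai,
        "abs_diff": abs_diff,
        "rel_change": rel_change,
    }


summary = {p: rate_change(n, a, b) for p, n, a, b in rows}

for pathologist, s in summary.items():
    print(f"{pathologist}: noAI={s['rate_noAI']:.3f} withAI={s['rate_withAI']:.3f} "
          f"abs_diff={s['abs_diff']:+.3f} rel_change={s['rel_change']:+.1%}")
```

Since this signal is a proxy derived from diagnosis labels, the recomputed changes describe label frequency rather than confirmed ordering behavior.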

31 Tumor Percentage Shift (noAI vs withAI)

Note for Pathologists: Summary of how much the reported tumor percentage changed when using AI compared to the no-AI baseline.

Shift in reported tumor percentage with AI assistance.
median_delta	iqr_delta	n
0	0	3404

32 Notes on Image Quality Sensitivity

Image quality variables are not yet captured. To evaluate the impact of artifacts, add a quality_flag or artifact_type column during ingest, then stratify the agreement and time metrics by that variable.
If quality metadata become available, reuse the structures above: for example, add quality_flag as a fixed-effect term in the mixed model and compute agreement stratified by quality bins.
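
The stratified agreement step could look like the following sketch. Everything here is an assumption for illustration: the record layout, the diagnosis and reference field names, the quality labels, and the toy records are hypothetical and are not study data. Only quality_flag is a column name suggested in the note above.

```python
from collections import defaultdict
from typing import Dict, Iterable, Mapping


def agreement_by_stratum(records: Iterable[Mapping], stratum_key: str = "quality_flag") -> Dict[str, float]:
    """Return the share of records where diagnosis equals reference, per stratum."""
    counts = defaultdict(lambda: [0, 0])  # stratum -> [agreements, total]
    for rec in records:
        # Records with a missing flag are grouped under "unknown" rather than dropped.
        stratum = rec.get(stratum_key) or "unknown"
        counts[stratum][1] += 1
        if rec["diagnosis"] == rec["reference"]:
            counts[stratum][0] += 1
    return {stratum: agree / total for stratum, (agree, total) in counts.items()}


# Toy records, invented for illustration only.
toy_records = [
    {"quality_flag": "good", "diagnosis": "benign", "reference": "benign"},
    {"quality_flag": "good", "diagnosis": "tumor", "reference": "tumor"},
    {"quality_flag": "blurred", "diagnosis": "benign", "reference": "tumor"},
    {"quality_flag": "blurred", "diagnosis": "tumor", "reference": "tumor"},
    {"quality_flag": None, "diagnosis": "benign", "reference": "benign"},
]

stratified = agreement_by_stratum(toy_records)
for stratum, rate in stratified.items():
    print(f"{stratum}: agreement={rate:.2f}")
```

The same grouping key could be passed as artifact_type if that column is captured instead, and the function could be run separately per mode to compare noAI and withAI within each quality stratum.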