Method comparison
Compare the selected methods side by side; rows in which the methods differ are marked with ≠.
| | Multi-armed bandits (UCB, Thompson sampling) | A/B test (online controlled experiment) | Randomized controlled trial (RCT) | Sequential / group sequential trial design |
|---|---|---|---|---|
| Field | Experimental design | Experimental design | Experimental design | Experimental design |
| Family | Hypothesis test | Hypothesis test | Hypothesis test | Hypothesis test |
| Year of origin ≠ | 1952 | 1935 | 1948 | 1979 |
| Originator ≠ | Robbins (1952); UCB1 by Auer et al. (2002); Thompson sampling by Thompson (1933) | Ron Kohavi et al. (Microsoft); conceptual roots in R. A. Fisher's randomized experiments (1935) | James Lind (early precursor, 1747); modern formulation: Austin Bradford Hill & Medical Research Council (1948) | O'Brien & Fleming; Pocock; Lan & DeMets |
| Type ≠ | Sequential decision / bandit algorithm | Parametric comparison (frequentist or Bayesian) | Interventional comparative study | Adaptive stopping trial design |
| Foundational source ≠ | Auer, P., Cesa-Bianchi, N., & Fischer, P. (2002). Finite-Time Analysis of the Multiarmed Bandit Problem. Machine Learning, 47(2–3), 235–256. | Kohavi, R., Tang, D., & Xu, Y. (2020). Trustworthy Online Controlled Experiments: A Practical Guide to A/B Testing. Cambridge University Press. ISBN: 9781108724265 | Schulz, K.F., Altman, D.G., Moher, D., for the CONSORT Group (2010). CONSORT 2010 Statement: Updated Guidelines for Reporting Parallel Group Randomised Trials. BMJ, 340, c332. | O'Brien, P.C. & Fleming, T.R. (1979). A Multiple Testing Procedure for Clinical Trials. Biometrics, 35(3), 549–556. |
| Alternative names ≠ | MAB, bandit algorithm, UCB1, Thompson sampling | split test, controlled experiment, two-variant test, A/B Testi (Online Kontrollü Deney) | RCT, randomised controlled trial, clinical trial, Randomize Kontrollü Çalışma (RCT) Tasarımı | group sequential design, adaptive stopping design, Ardışık Deneme Tasarımı (Sequential / Group Sequential) |
| Related methods ≠ | 4 | 4 | 7 | 3 |
| Summary ≠ | The multi-armed bandit (MAB) is an adaptive experimental framework that allocates trials sequentially across competing arms to minimise cumulative regret while learning which arm performs best. Formalised by Robbins in 1952 and given finite-time regret guarantees by Auer et al. (2002), it balances exploration of uncertain options against exploitation of the option that currently looks best. It is typically preferred over a fixed-allocation A/B test when the cost of serving inferior arms during the experiment matters. | An A/B test is a randomized controlled experiment that concurrently exposes two groups of users to a control variant (A) and a treatment variant (B) to determine whether a measured outcome differs significantly between them. The modern online controlled experiment framework was systematized by Ron Kohavi and colleagues at Microsoft in the 2000s, building on R. A. Fisher's classical randomization principles from 1935. It is the dominant causal inference tool in web product development, digital marketing, and experimentation platforms. | A randomized controlled trial (RCT) is the gold standard experimental design in clinical and health research: participants are randomly allocated to a treatment group or a control group so that the effect of an intervention can be estimated with high internal validity. The modern parallel-group RCT was formalized by Austin Bradford Hill and the Medical Research Council in their landmark streptomycin trial of 1948, and its reporting is governed today by the CONSORT 2010 guidelines (Schulz et al., 2010). | Sequential and group sequential trial designs allow a study to be stopped early, or continued, on the basis of interim analyses conducted as data accumulate. The core framework was formalised by O'Brien and Fleming in 1979 and extended by the alpha-spending approach of Lan and DeMets. It controls the overall Type I error rate across all planned looks by pre-specifying stopping boundaries (for efficacy and, optionally, futility) before enrolment begins. |
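
The exploration/exploitation trade-off described in the bandit summary can be illustrated with a minimal sketch of the UCB1 index rule from Auer et al. (2002): pull every arm once, then always pull the arm with the largest value of empirical mean plus sqrt(2 ln t / n_i). The function names `ucb1` and `bernoulli`, the Bernoulli reward model, and the arm probabilities below are assumptions made for illustration only; they do not come from the comparison table.

```python
import math
import random


def ucb1(reward_fns, horizon, seed=0):
    """Run UCB1 for `horizon` rounds; return (pull counts, mean reward per arm)."""
    rng = random.Random(seed)
    n_arms = len(reward_fns)
    counts = [0] * n_arms    # number of times each arm has been pulled
    means = [0.0] * n_arms   # running mean reward of each arm

    for t in range(horizon):  # t = number of pulls made so far
        if t < n_arms:
            arm = t  # initialisation: pull every arm once
        else:
            # UCB1 index: empirical mean + exploration bonus sqrt(2 ln t / n_i)
            arm = max(
                range(n_arms),
                key=lambda i: means[i] + math.sqrt(2.0 * math.log(t) / counts[i]),
            )
        reward = reward_fns[arm](rng)
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]  # incremental mean
    return counts, means


def bernoulli(p):
    """Hypothetical reward model: reward 1.0 with probability p, else 0.0."""
    return lambda rng: 1.0 if rng.random() < p else 0.0


if __name__ == "__main__":
    # Three illustrative arms; the third has the highest success probability.
    arms = [bernoulli(0.2), bernoulli(0.5), bernoulli(0.8)]
    counts, means = ucb1(arms, horizon=5000)
    print("pulls per arm:", counts)
    print("estimated means:", [round(m, 3) for m in means])
```

Unlike a fixed 50/50 A/B split, the allocation here shifts toward the better-performing arm as evidence accumulates. That reduces regret, but it leaves fewer observations on the weaker arms, so the effect estimates for them are less precise.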