• brucethemoose@lemmy.world
    1 day ago

    It’s honestly not that big a deal. Knowing how it was trained (beyond the config) wouldn’t help you modify it, and it’s still highly modifiable. It’s not like anyone can afford to replicate it anyway.

    It would be nice to publish the hyperparameters for research purposes, but… shrug.

    I think a subset of the exact training data and hyperparameters might help with quantization-aware training, but that’s all I’ve got.