• dondelelcaro@lemmy.world
    57 upvotes · 5 months ago

    Maybe not the hardest, but still challenging. Unknown biases in training data are a problem in any experimental design. Opaque ML models frequently make them harder to discover.

    • nova_ad_vitum@lemmy.ca
      25 upvotes · 5 months ago

      The unknown biases issue has no real solution. Take this same example, but instead of something simple like snow in the background, suppose the photographs of wolves were taken using zoom lenses (since photographers don't want to get near wild animals) while the dog photos were close-ups. If the ML was really just learning to recognize subtle photographic artifacts caused by the zoom lenses, that would be extremely difficult to detect, let alone prove.

      • dondelelcaro@lemmy.world
        3 upvotes · 5 months ago

        Exactly.

        The general approach is to use interpretable models, where you can understand how the model works and what features it uses to discriminate, but that doesn't work for all ML approaches (and even when it does, our understanding is incomplete).
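
To make the interpretable-model idea concrete, here is a minimal sketch using the wolf/dog example. Everything in it is an assumption for illustration: the data is synthetic, the two features (`snow_background`, `snout_length`) are hypothetical hand-made inputs rather than anything extracted from real photos, and the classifier is a plain logistic regression fitted with gradient descent in NumPy. The point is only that a linear model exposes its weights, so a shortcut feature can show up when you look at them.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000
is_wolf = rng.integers(0, 2, size=n)  # label: 1 = wolf, 0 = dog

# Hypothetical biased feature: snow matches the label 95% of the time,
# mimicking a dataset where wolves were mostly photographed in snow.
snow = np.where(rng.random(n) < 0.95, is_wolf, 1 - is_wolf).astype(float)

# Hypothetical genuine feature: only weakly informative about the label.
snout = 0.5 * is_wolf + rng.normal(0.0, 1.0, size=n)

X = np.column_stack([snow, snout])
feature_names = ["snow_background", "snout_length"]


def fit_logistic(X, y, lr=0.5, steps=2000):
    """Fit logistic regression by batch gradient descent; returns weights, bias last."""
    Xb = np.column_stack([X, np.ones(len(X))])  # append a bias column
    w = np.zeros(Xb.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-Xb @ w))       # predicted P(wolf)
        w -= lr * Xb.T @ (p - y) / len(y)       # gradient of mean log loss
    return w


weights = fit_logistic(X, is_wolf)

# Both features are on roughly comparable scales here, so the raw weights
# give a rough sense of what the model leans on.
for name, w in zip(feature_names, weights[:-1]):
    print(f"{name}: {w:+.2f}")
```

In this toy setup the weight on `snow_background` should come out much larger than the one on `snout_length`, which is the kind of red flag an interpretable model lets you spot. It only works because the biased feature was named and measured up front; a zoom-lens artifact buried in raw pixels would not appear as a labeled column, which is the limitation described above.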