• SuckMyWang@lemmy.world
    arrow-up
    116
    arrow-down
    1
    ·
    2 days ago

    Cool, so we can just make up our own rules now. Well, all Microsoft products are freeware now, by the same reasoning this guy uses.

    • 乇ㄥ乇¢ㄒ尺ㄖ@infosec.pub
      arrow-up
      1
      ·
      2 days ago

      Ok… so from now on … when I see a “repackaged” Microsoft product that for some reason… which I don’t care to know… doesn’t ask for a payment… I can use it without restrictions ?!! that’s really nice of you Microsoft … thank you.

  • themurphy@lemmy.ml
    arrow-up
    83
    ·
    2 days ago

    Fair, then everything I can find on the Internet must be freeware too. Set the sails, matey!

    • SlopppyEngineer@lemmy.world
      arrow-up
      44
      ·
      edit-2
      2 days ago

      No officer, this is not a pirated movie. It’s generated by an AI model I created and trained with data from the internet and the fact that it’s 99% identical to an existing movie is irrelevant.

      • Agathon@lemmy.dbzer0.com
        English
        arrow-up
        29
        arrow-down
        1
        ·
        2 days ago

        my AI is so good, it generated one that’s 100% identical

        plus my AI uses less than 99% of the electricity of Microsoft’s

      • M500@lemmy.ml
        English
        arrow-up
        1
        ·
        2 days ago

        Also, this groundbreaking AI model I made to do this was, umm, accidentally erased, and I also forgot how to make it.

        Jury: “seems reasonable”

  • dustycups@aussie.zone
    arrow-up
    37
    arrow-down
    2
    ·
    2 days ago

    From the article:

    Also, in 2022, several unidentified developers sued OpenAI and GitHub based on claims that the organizations used publicly posted programming code to train generative models in violation of software licensing terms

    They can argue about it not being a copy all they want. If there is a single GPL licenced line of code scraped then anything they produce is a derivative work & must be licenced GPL.

    nice.

  • Melllvar@startrek.website
    English
    arrow-up
    23
    ·
    2 days ago

    He seems to be confusing “freeware”, which is basically a license for copyrighted work, with “public domain”, which is the absence of a copyright.

    • Elise@beehaw.org
      arrow-up
      2
      ·
      2 days ago

      Yeah but anything you create automatically has a copyright, so for example this comment is not in the public domain. Its use is limited to the context I am using it in. That is, I expect it to be copied for federation purposes, but I wouldn’t say that AI is covered in this context.

      At least that’s the EU stance afaik. Like if I saw this comment on a billboard somewhere I’d see that as a breach of copyright and even privacy.

  • EnderMB@lemmy.world
    arrow-up
    25
    arrow-down
    1
    ·
    2 days ago

    I’m fine with that, but let’s put some rules in place around it.

    • Any AI models should be able to determine the source of their data to a defined level of accuracy.
    • There should be a well-defined way to block data from being used by AI. If one of these ways (e.g. robots.txt) has been breached, the model has to be rebuilt without the data, and reparations made to the content owners.
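
    As a rough illustration of the robots.txt check in the second rule, here is a minimal sketch using Python's standard `urllib.robotparser`. The crawler name `ExampleAIBot`, the URL, and the `may_collect` helper are made up for the example; real crawlers and opt-out mechanisms vary.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block one (made-up) AI crawler, allow everyone else.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""


def may_collect(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/post/1"
    print(may_collect(ROBOTS_TXT, "ExampleAIBot", page))
    print(may_collect(ROBOTS_TXT, "SomeBrowser", page))
```

    A pipeline that honored this would skip, or later purge, any page for which `may_collect` returns `False`.
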
    • ayaya@lemdro.id
      English
      arrow-up
      4
      arrow-down
      3
      ·
      2 days ago

      What you’re asking for is literally impossible.

      A neural network is basically nothing more than a set of weights. If one word makes a weight go up by 0.0001 and then another word makes it go down by 0.0001, and you do that billions of times for billions of weights, how do you determine what in the data created those weights? Every single thing that’s in the training data had some kind of effect on everything else.

      It’s like combining billions of buckets of water together in a pool and then taking out 1 cup from that and trying to figure out which buckets contributed to that cup. It doesn’t make any sense.
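
      A toy sketch of that argument, assuming a single weight and made-up update values (`apply_updates`, `history_a`, and `history_b` are illustrative names, not any real training API). Two different update histories end at the same weight, so the weight alone does not say which one produced it:

```python
def apply_updates(weight: float, updates: list[float]) -> float:
    """Apply a sequence of additive updates to a single weight."""
    for delta in updates:
        weight += delta
    return weight


# Two hypothetical training histories for the same weight.
history_a = [+0.0001, -0.0001, +0.0002]  # three "words", two of which cancel out
history_b = [+0.0002]                    # one "word"

final_a = apply_updates(0.0, history_a)
final_b = apply_updates(0.0, history_b)

# Same final weight, different data behind it.
print(final_a == final_b)
```

      This only covers reading provenance back out of the finished weights; it says nothing about records kept while the dataset is being built.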

      • EnderMB@lemmy.world
        arrow-up
        11
        arrow-down
        1
        ·
        2 days ago

        Respectfully, I worked for Alexa AI on compositional ML, and we were largely able to do exactly this with customer utterances, so to say it is impossible is simply not true. Many companies have to have some degree of ability to remove troublesome data, and while tracing data inside a model is rather difficult (historically it would be done during the building of datasets or measured at evaluation time), it’s definitely something that most big tech companies will do.

      • socphoenix@midwest.social
        arrow-up
        6
        arrow-down
        3
        ·
        2 days ago

        It’s not impossible lol. All a company would need to do is keep track of where it gets its content. If I use a script to download as much of the internet as possible and end up with a bunch of copyrighted content, I could still get in trouble. Hell, there was even a guy arrested for downloading JSTOR without authorization. Stop letting these guys get away with crimes just because you like the idea of the end product.
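
        A minimal sketch of that kind of bookkeeping, assuming each scraped document is stored together with its source URL and a content hash. The `Record` type, `make_record`, `drop_source`, and the example URLs are all hypothetical:

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    source_url: str  # where the text was downloaded from
    sha256: str      # hash of the text, to identify exact copies later
    text: str


def make_record(source_url: str, text: str) -> Record:
    """Store a document together with its provenance."""
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
    return Record(source_url, digest, text)


def drop_source(records: list[Record], blocked_prefix: str) -> list[Record]:
    """Return only the records whose URL does not start with blocked_prefix."""
    return [r for r in records if not r.source_url.startswith(blocked_prefix)]


if __name__ == "__main__":
    dataset = [
        make_record("https://example.com/a", "hello"),
        make_record("https://blocked.example/b", "secret"),
    ]
    cleaned = drop_source(dataset, "https://blocked.example/")
    print([r.source_url for r in cleaned])
```

        With a manifest like this, removing a source is a filter over the dataset before the next training run; it does not undo whatever a model already trained on that data has learned.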

  • ___@l.djw.li
    English
    arrow-up
    11
    arrow-down
    1
    ·
    2 days ago

    I went into a smidge more detail over on my Mastodon last night, but my response is summed up as “WTAF? No! Freeware is an explicit license, as anyone from the BBS days will recall.”