• 2 Posts
  • 30 Comments
Joined 1 year ago
Cake day: June 14th, 2023

  • But then it does go on to quote materials verbatim, which shows it’s not “just” ‘extracting patterns’.

    It is just extracting patterns. It is estimating statistically which token (“word”, informally speaking) is likely to follow, given the previous stream.

    It can only reproduce passages of things it has seen many, many times. It cannot reproduce the whole work. Those two quotes can be seen elsewhere on the internet plenty of times. And it’s fair use there, so it would be fair use with a chat bot as well.

    There have been papers published where researchers were able to regenerate an image that was present in the training set of Stable Diffusion. But they were only able to find that image (and others) in particular, because they were present in the training set multiple times, and the caption was the same (it was the portrait picture of some executive at a company).
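A toy sketch of why duplication matters, assuming a plain bigram counter as a stand-in (far simpler than an LLM or a diffusion model, so this is an illustration and not how those systems are built). The corpus, `train_bigrams` and `greedy_continue` are hypothetical names made up for this example. A line that appears many times dominates the statistics and comes back verbatim, while a line seen once does not.

```python
from collections import Counter, defaultdict


def train_bigrams(documents):
    """Count, for every word, which words follow it across all documents."""
    follows = defaultdict(Counter)
    for doc in documents:
        words = doc.split()
        for current, nxt in zip(words, words[1:]):
            follows[current][nxt] += 1
    return follows


def greedy_continue(follows, start, max_words=10):
    """Always pick the most frequent follower; stop when there is none."""
    out = [start]
    while len(out) < max_words and follows[out[-1]]:
        out.append(follows[out[-1]].most_common(1)[0][0])
    return " ".join(out)


# Hypothetical corpus: one line duplicated many times, two lines seen once.
corpus = ["it was the best of times"] * 20 + [
    "it rained all day",
    "the cat slept",
]
model = train_bigrams(corpus)
regurgitated = greedy_continue(model, "it")
print(regurgitated)
```

Starting from "it", the duplicated line wins at every step, so the rare "it rained all day" never shows up in the output.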

    when given the book and pages — quote copyrighted works

    Yeah, you are not gonna be able to do that with an LLM. They will be able to quote only some passages, and only of popular books that have been quoted often enough.

    Even if they started to use my service to literally copy entire books?

    You cannot do that with an LLM.

    Why are you defending massive corporations who could just pay up? Isn’t the whole “corporations putting profits over anything” thing a bit… seen already?

    I hate that some corporations are burning money, resources and energy on this, but the solution is not to restrict fair use even further. Machine Learning is complex, but if I had to summarize it in some way, it is “just” gathering statistics of which word comes next (in the case of a text model). This is no different from taking a large corpus of text and sampling it for word frequency, letter frequency, N-gram frequency, etc. It is well known that this is fair use. You only store the copyrighted works to run the software and produce a very transformative work: a summary many orders of magnitude smaller than the copyrighted works. This is fair use, and it should stay that way. Changing that is gonna harm the public, small companies and independent researchers way more than big tech companies.
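For reference, this is roughly what "sampling a corpus for word, letter and N-gram frequency" looks like. It is a minimal sketch: `ngram_counts`, `next_word_probabilities` and the sample sentence are hypothetical, and a real text model learns far richer statistics than these tables.

```python
from collections import Counter


def ngram_counts(items, n):
    """Count every run of n consecutive items (words or letters)."""
    items = list(items)
    return Counter(tuple(items[i:i + n]) for i in range(len(items) - n + 1))


def next_word_probabilities(words, previous):
    """Estimate P(next word | previous word) from bigram counts."""
    bigrams = ngram_counts(words, 2)
    followers = {pair[1]: c for pair, c in bigrams.items() if pair[0] == previous}
    total = sum(followers.values())
    return {word: c / total for word, c in followers.items()}


sample = "the cat sat on the mat and the cat slept"
words = sample.split()

word_freq = ngram_counts(words, 1)                       # word frequency
letter_freq = ngram_counts(sample.replace(" ", ""), 1)   # letter frequency
probs = next_word_probabilities(words, "the")            # "which word comes next"
print(word_freq.most_common(2), probs)
```

The output is a set of frequency tables, not the text itself; that is the "summary" the comment is talking about.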

    As I said in another comment, I would very much welcome a way to force big corpos to release their models. Make a model bigger than N parameters? You needed too much fair use in one gulp: your model has to be public, and in the public domain. I would fucking welcome that! But going in the opposite direction is just risky.

    I don’t understand why small individuals think that copyright is their friend, and will protect them from big tech companies. Copyright will always harm the weak and protect the powerful as a net result. It’s already a miracle that we can enjoy free software and culture by licenses that leverage copyright in our favor.


  • Also preemptively deciding that me disagreeing with you automatically makes you right because you predicted your explanation wouldn’t satisfy me is just A-tier bullshit.

    I predicted that I would waste my time by replying to you, and I predicted right.

    I wanted to give it a chance, though, because Lemmy is a place that is friendly enough and that I want to thrive, despite how little I contribute. I tried to be constructive, explain things the best I could, assume the best possible faith, etc. When you just say that I sound like an asshole, and act in complete bad faith about what Russian roulette is supposed to mean in the context of someone who says “you can beat me at any game”, I feel the urge to try the block feature in Lemmy, sorry.


  • Two people go on a date. The date is going well, there is chemistry between the two people. One says “if you beat me at any game we can have sex”. The two people will typically play a board or card game, and will flirt with the opportunity of sex during the game play, which is gonna be fun and exciting. Seems a good plot idea for your average romantic comedy movie or teenager’s series.

    Now the joke is that the choice of game is stupid because you end up killing your date. Just with that you could make a meme/joke. Now the post is doubling down on the stupidity, insanity, etc., by making it morbid and showing that the guy still had sex with the corpse.

    Here it is: my take on the issue, which is unlikely to be the only possible explanation that is not “incel shit”. I’ve wasted 10 minutes of my time, and you’ll likely still not agree with me, which will prove my first comment valid.

    Cheers.


  • suy@programming.dev to Lemmy Shitpost@lemmy.world: Winner's Luck · 15 up, 2 down · 6 months ago

    Has it occurred to you that pressing the downvote button is just much easier than having to bother explaining something that should be obvious?

    If it is not obvious to you that it’s not incel shit, maybe even after an explanation you still won’t agree because you have different views (which I’m not saying are not respectable, but are still different, so an agreement can’t be reached), so whoever replies to you would have wasted their time.

    So of course people downvote without replying.


  • Yes. There is already an answer with many votes saying so, but I’ll add myself to the list.

    I don’t have to like all of the language, or even all of the standard library. I learnt C++ with the Qt library, and I still do 99% of my development using Qt because it’s the kind of software that I like to write the most. I can choose the parts that I like the most about the full C++ ecosystem, like most people do (you would have to see how different game development is, for example).

    I’m also learning Rust, and I see nothing wrong with it. It’s just that I see C++ as better for the kind of stuff that I need to write (at this time at least).







  • Related: There is an article on LWN called Lua and Python, which is mostly about the approach of the two languages WRT being “batteries included” or not.

    I think Lua being a bit barebones is 100% fine… if you just pair it with a good helper library, or set of libraries with a coherent API, that allows it to thrive. Then you can either use the framework library or not, depending on whether your project requires the extras, or can do without.

    As a parallel, I’ve been doing C++ development for almost two decades, and I cannot imagine doing anything non-trivial without Qt. For example, Qt has a debug framework that automatically pretty-prints most containers and also adds the newline for you. Also, QString is an actual string type, whereas std::string is more like QByteArray. It’s functionality that is essential for me (and these are just the minimal examples… then Qt has all the GUI functionality, of course, but I use Qt even in console-only programs!).

    This is surely opinionated on my side, and most C++ devs don’t see it this way, but my point is that a language with a “core experience” that is lackluster to you should not be a bad thing if the language is capable enough to provide an ecosystem with a good 3rd party library that adds exactly what you want. In the Lua ecosystem, maybe that is Penlight.

    But I totally get your point. Penlight doesn’t even seem to have a math library, so I found no round implementation there. This may not be a problem for some, but it can be a deal breaker for others.
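The text-versus-bytes distinction can be sketched in a few lines. Python is used here only as a compact stand-in, on the assumption that its `str`/`bytes` split is a fair analogy for QString/QByteArray. The analogy is loose: Python's `str` counts code points, while QString stores UTF-16 code units.

```python
text = "se\u00f1or"               # text type: a sequence of characters ("señor")
raw = text.encode("utf-8")        # byte container: the same text as UTF-8 bytes

char_count = len(text)            # counts characters
byte_count = len(raw)             # counts bytes; "ñ" needs two bytes in UTF-8

# Slicing a byte container can cut a character in half; slicing text cannot.
broken = raw[:3]                  # ends in the middle of "ñ"
safe = text[:3]                   # ends right after "ñ"

try:
    broken.decode("utf-8")
    decodes_cleanly = True
except UnicodeDecodeError:
    decodes_cleanly = False

print(char_count, byte_count, safe, decodes_cleanly)
```

A byte-oriented type reports 6 for a 5-character word and happily hands back half a character, which is the kind of surprise a real string type avoids.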


  • suy@programming.dev to Funny@sh.itjust.works: Holland has so much space · 3 up, 5 down · edited · 7 months ago

    Normally the good jokes are also somewhat smart, even though they are not “serious”. A joke about Texas being big is not very smart, IMHO. It is also not very original, as it’s not the first one I’ve seen in this vein. And above all, it’s especially stupid to end it with a remark about “The European mind cannot comprehend this”, because Europeans know a lot more about the US than the US people know about Europe.

    IOW, it’s not that it’s struck a nerve, it’s that it was legit bad.

    PS: Oh, and since it addresses Europeans, it seems to address Europe as a whole, which makes it doubly stupid, because then individual members of the EU/continent are like US states, and then each member/state has routes as long as the one in the original meme.


  • I’d have to dig it up, but I think it said that it added the PID and the uninitialized memory to add a bit more data to the entropy pool in a cheap way. I honestly don’t get how that additional data can be helpful. To me it’s the very opposite. The PID and the undefined memory are nowhere near as good as proper randomness. So, even without Debian’s intervention, it was a bad idea. The undefined memory triggered Valgrind warnings, and after Debian’s patch, if it weren’t for the PID, all keys would have been reduced to 0 randomness, which would probably have raised the alarm much sooner.
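A rough sketch of how little the PID contributes when it is the only thing that varies. This is not OpenSSL's code: `toy_key_from_seed` is a hypothetical stand-in for key generation, and the PID range is an assumption (a common Linux default for `pid_max` is 32768, so PIDs go from 1 to 32767).

```python
import hashlib

# Assumption: PIDs range from 1 to 32767 (pid_max commonly defaults to 32768).
MAX_PID = 32768


def toy_key_from_seed(pid):
    """Stand-in for key generation: the output depends on the PID only."""
    return hashlib.sha256(pid.to_bytes(4, "big")).hexdigest()


def brute_force(target_key):
    """Recover the PID behind a key by trying every possible PID."""
    for pid in range(1, MAX_PID):
        if toy_key_from_seed(pid) == target_key:
            return pid
    return None


# Every key this toy generator can ever produce.
all_keys = {toy_key_from_seed(pid) for pid in range(1, MAX_PID)}

victim_key = toy_key_from_seed(4242)   # hypothetical victim process
recovered = brute_force(victim_key)
print(len(all_keys), recovered)
```

With a few tens of thousands of possible seeds, every key can be enumerated in well under a second, so the PID is not zero randomness, but it is close enough to zero for an attacker.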


  • no more patching fuzzers to allow that one program to compile. Fix the program

    Agreed.

    Remember Debian’s OpenSSL fiasco? The one that affected all the other derivatives as well, including Ubuntu.

    It all started because OpenSSL added a bunch of uninitialized memory and the PID to the entropy pool. Who the hell relies on uninitialized memory, ever? The Debian maintainer wanted to fix Valgrind errors, and submitted a patch. It wasn’t properly reviewed, nor accepted in OpenSSL. The maintainer added it to the Debian package’s patches, and the rest is history.

    Everyone blamed Debian “because it only happened there”, and mistakes were definitely made on that side, but I blame the OpenSSL developers much more.


  • suy@programming.dev to Linux@lemmy.ml: XZ backdoor in a nutshell · 40 up, 1 down · 7 months ago

    Is it, really? If the whole point of the library is dealing with binary files, how are you even going to have automated tests of the library?

    The scary thing is that there are people still using autotools, or any other hyper-complicated build system in which this is easy to hide, because who the hell cares about learning Makefiles, autoconf, automake, M4 and shell scripting all at once just to compile a few C files? I think hiding this in any other build system would definitely have been harder. Check this mess:

      dnl Define somedir_c_make.
      [$1]_c_make=`printf '%s\n' "$[$1]_c" | sed -e "$gl_sed_escape_for_make_1" -e "$gl_sed_escape_for_make_2" | tr -d "$gl_tr_cr"`
      dnl Use the substituted somedir variable, when possible, so that the user
      dnl may adjust somedir a posteriori when there are no special characters.
      if test "$[$1]_c_make" = '\"'"${gl_final_[$1]}"'\"'; then
        [$1]_c_make='\"$([$1])\"'
      fi
      if test "x$gl_am_configmake" != "x"; then
        gl_[$1]_config='sed \"r\n\" $gl_am_configmake | eval $gl_path_map | $gl_[$1]_prefix -d 2>/dev/null'
      else
        gl_[$1]_config=''
      fi
    

  • I’m not fully sure what the intent of the joke is, but note that yes, it’s true that a header typically just has the prototype. However, tons of more advanced libraries are “header-only”. Everything is in a single header originally, in development, or it’s a collection of headers (that optionally gets “amalgamated” into a single header). This is sometimes done intentionally to simplify integration of the library (“just copy these files to your repo, or add it as a submodule”), but sometimes it’s entirely necessary because the code is just template code that needs to be in a header.

    C++20 adds modules, and the situation is a bit more involved, but I’m not confident enough to elaborate on this. :) Compile times are much better, but it’s something that the build system and the compilers need to support.




  • I’ve wanted to start a project in Rust, but for the ideas that I have (and the time that I have for a hobby project, since at work it’s rarely about starting a new one, but continuing an existing one), Rust seemed a viable but not ideal alternative to just doing it all in C++, for which I already have enough knowledge and very well proven libraries. I will look again soon, and I will keep looking because eventually something will surely click. It’s just that so far, the time has not been right.

    Note that my point is not that it’s unusable for everyone. Just that it’s false that “some people just can’t seem to let [C or C++] go”, as the previous comment said. I can’t let go of something that works well for something that doesn’t, given the projects that I have to work on.


  • It’s just time to move on from C/C++, but some people just can’t seem to let go.

    The Rust community has 2 websites that I keep periodically checking: Are we game yet? and Are we GUI yet?. The answers on those sites are respectively (as of February 2024, when this comment is written) “Almost. We have the blocks, bring your own glue” and “The roots aren’t deep but the seeds are planted”. I’ve seen the progress in Bevy and Slint, but it’s still the same: those websites don’t change, and my situation WRT making a Rust project for fun or work is the same.

    I’ll be happy to start doing Rust projects whenever I get the chance (which will be when it’s a sufficient tool for my use cases). But I’m tired of smoke sellers.