A big biometric security company in the UK, Facewatch, is in hot water after its facial recognition system caused a major snafu: the system wrongly identified a 19-year-old girl as a shoplifter.

  • PseudorandomNoise@lemmy.world
    English
    arrow-up
    244
    arrow-down
    2
    ·
    1 month ago

    Despite concerns about accuracy and potential misuse, facial recognition technology seems poised for a surge in popularity. California-based restaurant CaliExpress by Flippy now allows customers to pay for their meals with a simple scan of their face, showcasing the potential of facial payment technology.

    Oh boy, I can’t wait to be charged for someone else’s meal because they look just enough like me to trigger a payment.

    • Cethin@lemmy.zip
      English
      arrow-up
      168
      ·
      1 month ago

      I have an identical twin. This stuff is going to cause so many issues even if it worked perfectly.

      • Maggoty@lemmy.world
        English
        arrow-up
        2
        arrow-down
        1
        ·
        edit-2
        1 month ago

        Nuh uh! All you and your twin have to do is write the word Twin on your forehead every morning. Just make sure to never commit a crime with it written where your twin puts their sign. Or else, you know… You might get away with it.

        Nope no obvious problems here at all!

      • CeeBee@lemmy.world
        English
        arrow-up
        12
        arrow-down
        28
        ·
        1 month ago

        Ok, some context here from someone who built and worked with this kind of tech for a while.

        Twins are no issue. I’m not even joking, we tried for multiple months in a live test environment to get the system to trip over itself, but it just wouldn’t. Each twin was detected perfectly every time. In fact, I myself could only tell them apart by their clothes. They had very different styles.

        The reality with this tech is that, just like everything else, it can’t be perfect (at least not yet). For all the false detections you hear about, there have been millions upon millions of correct ones.

          • CeeBee@lemmy.world
            English
            arrow-up
            6
            arrow-down
            17
            ·
            1 month ago

            Yes, because like I said, nothing is ever perfect. There can always be a billion little things affecting each and every detection.

            A better statement would be “only one false detection out of 10 million”

            • Zron@lemmy.world
              English
              arrow-up
              36
              ·
              1 month ago

              You want to know a better system?

              What if each person had some kind of physical passkey that linked them to their money, and they used that to pay for food?

              We could even have a bunch of security put around this passkey that makes it really easy to disable it if it gets lost or stolen.

              As for shoplifting, what if we had some kind of societal system that levied punishments against people by providing a place where the victim and accused can show evidence for and against the infraction, and an impartial pool of people decides if they need to be punished or not.

              • CeeBee@lemmy.world
                English
                arrow-up
                11
                arrow-down
                2
                ·
                1 month ago

                100%

                I don’t disagree with a word you said.

                FR for a payment system is dumb.

            • fishpen0@lemmy.world
              English
              arrow-up
              12
              arrow-down
              1
              ·
              edit-2
              1 month ago

              Another way to look at that is ~810 people having an issue with a different 810 people every single day, assuming only one scan per day. That’s roughly 296,000 people (810 × 365) having a huge fucking problem at least once every single year.

              I have this problem with my face in the TSA pre and passport system, and every time I fly it gets worse because their confidence that it is correct keeps going up and their trust in my actual fucking ID keeps going down.
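
For anyone who wants to check the arithmetic, here is a minimal sketch. The world population of roughly 8.1 billion and the one-scan-per-person-per-day rate are assumptions made for illustration; the 1-in-10-million figure is the one quoted upthread. Under those assumptions it works out to about 810 false detections per day and about 295,650 per year.

```python
# Assumed inputs (illustrative, not measured from any real system).
WORLD_POPULATION = 8_100_000_000   # assumption: roughly 8.1 billion people
ONE_FALSE_DETECTION_IN = 10_000_000  # "one false detection out of 10 million"


def expected_false_detections(population: int, one_in: int,
                              scans_per_day: int = 1, days: int = 1) -> float:
    """Expected number of false detections over the given number of days."""
    total_scans = population * scans_per_day * days
    return total_scans / one_in


per_day = expected_false_detections(WORLD_POPULATION, ONE_FALSE_DETECTION_IN)
per_year = expected_false_detections(WORLD_POPULATION, ONE_FALSE_DETECTION_IN, days=365)

print(f"per day:  {per_day:,.0f}")
print(f"per year: {per_year:,.0f}")
```

Changing `scans_per_day` shows how quickly the totals grow once people are scanned more than once a day.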

              • CeeBee@lemmy.world
                English
                arrow-up
                6
                arrow-down
                4
                ·
                edit-2
                1 month ago

                I have this problem with my face in the TSA pre and passport system

                Interesting. Can you elaborate on this?

                Edit: downvotes for asking an honest question. People are dumb

        • MonkderDritte@feddit.de
          English
          arrow-up
          18
          arrow-down
          1
          ·
          1 month ago

          it can’t be perfect (at least not yet).

          Or ever, because it locks you out after a drunken night otherwise.

          • CeeBee@lemmy.world
            English
            arrow-up
            8
            arrow-down
            3
            ·
            1 month ago

            Or ever because there is no such thing as 100% in reality. You can only add more digits at the end of your accuracy, but it will never reach 100.

        • boatswain@infosec.pub
          English
          arrow-up
          14
          ·
          1 month ago

          In fact, I myself could only tell them apart by their clothes. They had very different styles.

          This makes it sound like you only tried one particular set of twins–unless there were multiple sets, and in each set the two had very different styles? I’m no statistician, but a single set doesn’t seem statistically significant.

          • CeeBee@lemmy.world
            English
            arrow-up
            6
            arrow-down
            2
            ·
            1 month ago

            What I’m saying is we had a deployment in a large facility. It was a partnership with the org that owned the facility to allow us to use their location as a real-world testing area. We’re talking about multiple buildings, multiple locations, and thousands of people (all aware of the system being used).

            Two of the employees were twins. It wasn’t planned, but it did give us a chance to see if twins were a weak point.

            That’s all I’m saying. It’s mostly anecdotal, as I can’t share details or numbers.

            • boatswain@infosec.pub
              English
              arrow-up
              7
              arrow-down
              2
              ·
              1 month ago

              Two of the employees were twins. It wasn’t planned, but it did give us a chance to see if twins were a weak point.

              No, it gave you a chance to see if that particular set of twins was a weak point.

              • CeeBee@lemmy.world
                English
                arrow-up
                4
                ·
                1 month ago

                With that logic we would need to test the system on every living person to see where it fails.

                The system had been tested ad nauseam in a variety of scenarios (including with twins and every other combination you can think of, and many you can’t). In this particular situation, a real-world test in a large facility with many hundreds of cameras everywhere, there happened to be twins.

                It’s a strong data point regardless of your opinion. If it was the only one then you’d have a point. But like I said, it was an anecdotal example.

        • techt@lemmy.world
          English
          arrow-up
          8
          ·
          edit-2
          1 month ago

          Can you please start linking studies? I think that might actually turn the conversation in your favor. I found a NIST study (pdf link), on page 32, in the discussion portion of 4.2 “False match rates under demographic pairing”:

          The results above show that false match rates for imposter pairings in likely real-world scenarios are much higher than those from measured when imposters are paired with zero-effort.

          This seems to say that the false match rate gets higher and higher as the subjects are more demographically similar; the highest error rate on the heat map below that is roughly 0.02.

          Something else no one here has talked about yet – no one is actively trying to get identified as someone else by facial recognition algorithms yet. This study was done on public mugshots, so no effort to fool the algorithm, and the error rates between similar demographics are atrocious.

          And my opinion: Entities using facial recognition are going to choose the lowest bidder for their system unless there’s a higher security need than, say, a grocery store. So, we have to look at the weakest performing algorithms.

          • CeeBee@lemmy.world
            English
            arrow-up
            6
            ·
            1 month ago

            My references are the NIST tests.

            https://pages.nist.gov/frvt/reports/1N/frvt_1N_report.pdf

            That might be the one you’re looking at.

            Another thing to remember about the NIST tests is that they try to use a standardized threshold across all vendors. The point is to compare the results in a fair manner across systems.

            The system I worked on was tested by NIST with an FMR of 1e-5. But we never used that threshold and always used a threshold that equated to 1e-7, which is orders of magnitude more accurate.

            And my opinion: Entities using facial recognition are going to choose the lowest bidder for their system unless there’s a higher security need than, say, a grocery store. So, we have to look at the weakest performing algorithms.

            This definitely is a massive problem and likely does contribute to poor public perception.
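
A rough way to see what the gap between an FMR of 1e-5 and 1e-7 means in practice is sketched below. It assumes every comparison is an independent trial and uses a hypothetical workload of one million comparisons; neither assumption comes from the NIST report, and real comparisons are not fully independent.

```python
import math


def expected_false_matches(comparisons: int, fmr: float) -> float:
    """Expected count of false matches across a number of impostor comparisons."""
    return comparisons * fmr


def prob_at_least_one_false_match(comparisons: int, fmr: float) -> float:
    """1 - (1 - fmr)^comparisons, computed in a numerically stable way."""
    return -math.expm1(comparisons * math.log1p(-fmr))


COMPARISONS = 1_000_000  # hypothetical workload, for illustration only

for fmr in (1e-5, 1e-7):
    expected = expected_false_matches(COMPARISONS, fmr)
    p_any = prob_at_least_one_false_match(COMPARISONS, fmr)
    print(f"FMR={fmr:.0e}: expected false matches={expected:.2f}, "
          f"P(at least one)={p_any:.4f}")
```

Under these assumptions the stricter threshold turns "about ten false matches" into "about a one in ten chance of a single false match" for the same workload.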

            • techt@lemmy.world
              English
              arrow-up
              3
              ·
              1 month ago

              Thanks for the response! It sounds like you had access to a higher quality system than the worst, to be sure. Based on your comments I feel that you’re projecting the confidence in that system onto the broader topic of facial recognition in general; you’re looking at a good example and people here are (perhaps cynically) pointing at the worst ones. Can you offer any perspective from your career experience that might bridge the gap? Why shouldn’t we treat all facial recognition implementations as unacceptable if only the best – and presumably most expensive – ones are?

              A rhetorical question aside from that: is determining one’s identity an application where anything below the unachievable success rate of 100% is acceptable?

              • CeeBee@lemmy.world
                English
                arrow-up
                5
                ·
                1 month ago

                Based on your comments I feel that you’re projecting the confidence in that system onto the broader topic of facial recognition in general; you’re looking at a good example and people here are (perhaps cynically) pointing at the worst ones. Can you offer any perspective from your career experience that might bridge the gap? Why shouldn’t we treat all facial recognition implementations as unacceptable if only the best – and presumably most expensive – ones are?

                It’s a good question, and I don’t have the answer to it. But a good example I like to point at is the ACLU’s announcement of their test on Amazon’s Rekognition system.

                They tested the system using the default value of 80% confidence, and their test resulted in 20% false identification. They then boldly claimed that FR systems are all flawed and no one should ever use them.

                Amazon even responded saying that the ACLU’s test with the default values was irresponsible, and Amazon’s right. This was before such public backlash against FR, and the reasoning for a default of 80% confidence was the expectation that most people using it would do silly stuff like celebrity lookalikes. That being said, it was stupid to set the default to 80%, but that’s just hindsight speaking.

                My point here is that, while FR tech isn’t perfect, the public perception is highly skewed. If there was a daily news report detailing the number of correct matches across all systems, these few showing a false match would seem ridiculous. The overwhelming vast majority of news reports on FR are about failure cases. No wonder most people think the tech is fundamentally broken.

                A rhetorical question aside from that: is determining one’s identity an application where anything below the unachievable success rate of 100% is acceptable?

                I think most systems in use today are fine in terms of accuracy. The consideration becomes “how is it being used?” That isn’t to say that improvements aren’t welcome, but in some cases it’s like trying to use the hook on the back of a hammer as a screwdriver. I’m sure it can be made to work, but fundamentally it’s the wrong tool for the job.

                FR in a payment system is just all wrong. It’s literally forcing the use of a tech where it shouldn’t be used. FR can be used for validation if increased security is needed, like accessing a bank account. But never as the sole means of authentication. You should still require a bank card + pin, then the system can do FR as a kind of 2FA. The trick here would be to, first, use a good system, and then, second, lower the threshold to something that borders on “fairly lenient”. That way you eliminate any false rejections while still maintaining an incredibly high level of security. In that case the chances of your bank card AND pin being stolen by someone who looks so much like you that they trick FR are effectively nil (but can never be truly zero). And if someone is being targeted by a threat actor who can coordinate such things, then that actor would have the resources to just get around the cyber security of the bank from the comfort of anywhere in the world.

                Security in every single circumstance is a trade-off with convenience. Always, and in every scenario.

                FR works well with existing access control systems. Swipe your badge card, then it scans you to verify you’re the person identified by the badge.
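
The "card + PIN first, FR as a second factor" idea can be sketched roughly as below. Everything here is hypothetical: the `AuthAttempt` type, the `authorize` function, and the threshold value are made up for illustration and do not describe any real bank or badge system.

```python
from dataclasses import dataclass


@dataclass
class AuthAttempt:
    card_valid: bool    # the card or badge was accepted
    pin_valid: bool     # the PIN matched
    face_score: float   # similarity in [0, 1] between live capture and enrolled face


# Hypothetical "fairly lenient" threshold; a real value depends on the system.
LENIENT_FACE_THRESHOLD = 0.6


def authorize(attempt: AuthAttempt,
              face_threshold: float = LENIENT_FACE_THRESHOLD) -> bool:
    """FR is only a 1:1 check on top of card + PIN, never the sole factor."""
    if not (attempt.card_valid and attempt.pin_valid):
        return False
    return attempt.face_score >= face_threshold


print(authorize(AuthAttempt(card_valid=True, pin_valid=True, face_score=0.72)))
print(authorize(AuthAttempt(card_valid=False, pin_valid=True, face_score=0.99)))
```

The point of the design is that a high face score alone never grants access, so a lookalike without the card and PIN gets nowhere.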

                FR also works well in surveillance, with the incredibly important addition of human-in-the-loop. For example, the system I worked on simply reported detections to a SoC (with all the general info about the detection including the live photo and the reference photo). Then the operator would have to look at the details and manually confirm or reject the detection. The system made no decisions, it simply presented the info to an authorized person.

                This is the key portion that seems to be missing in all news reports about false arrests and whatnot. I’ve looked into all the FR related false arrests and from what I could determine none of those cases were handled properly. The detection results were simply taken as gospel truth and no critical thinking was applied. In some of those cases the detection photo and reference (database) photo looked nothing alike. It’s just that the people operating those systems are either idiots or just don’t care. Both of those are policy issues entirely unrelated to the accuracy of the tech.
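
A minimal sketch of the human-in-the-loop flow described above, where the system only reports a detection and an operator has to confirm or reject it. The class and function names are hypothetical and are not taken from any real product.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Decision(Enum):
    PENDING = "pending"
    CONFIRMED = "confirmed"
    REJECTED = "rejected"


@dataclass
class Detection:
    live_photo: str        # path or ID of the live capture
    reference_photo: str   # path or ID of the database photo
    score: float           # similarity score reported by the matcher
    decision: Decision = Decision.PENDING
    reviewed_by: Optional[str] = None


def review(detection: Detection, operator: str, is_same_person: bool) -> Detection:
    """Record the operator's manual judgement; the matcher never decides."""
    detection.decision = Decision.CONFIRMED if is_same_person else Decision.REJECTED
    detection.reviewed_by = operator
    return detection


def is_actionable(detection: Detection) -> bool:
    """Only detections a named operator has confirmed may be acted on."""
    return detection.decision is Decision.CONFIRMED and detection.reviewed_by is not None


alert = Detection("live_001.jpg", "ref_042.jpg", score=0.97)
print(is_actionable(alert))  # a high score alone is not enough
review(alert, operator="operator-1", is_same_person=False)
print(is_actionable(alert))
```

However high the score, nothing becomes actionable until a person has compared the two photos and signed off.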

                • techt@lemmy.world
                  English
                  arrow-up
                  2
                  ·
                  1 month ago

                  The mishandling is indeed what I’m concerned about most. I now understand far better where you’re coming from, sincere thanks for taking the time to explain. Cheers

                • hazeebabee@slrpnk.net
                  English
                  arrow-up
                  2
                  ·
                  1 month ago

                  Super interesting to read your more technical perspective. I also think facial recognition (and honestly most AI use cases) is best when used to supplement an existing system. Such as flagging a potential shoplifter to human security.

                  Sadly most people don’t really understand the tech they use for work. If the computer tells them something they just kind of blindly believe it. Especially in a work environment where they have been trained to do what the machine says.

                  My guess is that the people were trained on how to use the system at a very basic level. Troubleshooting and understanding the potential for error typically isn’t covered in 30min corporate instructional meetings. They just get a little notice saying a shoplifter is in the store and act on that without thinking.

        • Cethin@lemmy.zip
          English
          arrow-up
          5
          ·
          1 month ago

          This tech (AI detection) or purpose built facial recognition algorithms?

      • Telodzrum@lemmy.world
        English
        arrow-up
        5
        arrow-down
        26
        ·
        1 month ago

        If it works anything like Apple’s Face ID, twins don’t actually map all that similarly. In the general population the probability of a matching mapping of the underlying facial structure is approximately 1:1,000,000. It is slightly higher for identical twins and then higher again for prepubescent identical twins.

        • MonkderDritte@feddit.de
          English
          arrow-up
          39
          ·
          edit-2
          1 month ago

          Meaning, 8’000 potential false positives per user globally. About 300 in US, 80 in Germany, 7 in Switzerland.

          Might be enough for Iceland.
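
The per-country numbers can be reproduced with a quick sketch. The population figures below are rough assumptions for illustration, so the results land in the same ballpark as the comment rather than matching it exactly; the 1:1,000,000 probability is the one quoted upthread.

```python
ONE_IN = 1_000_000  # the 1:1,000,000 match probability quoted upthread

# Rough population figures; assumptions for illustration only.
POPULATIONS = {
    "World": 8_000_000_000,
    "US": 330_000_000,
    "Germany": 84_000_000,
    "Switzerland": 8_800_000,
    "Iceland": 380_000,
}


def expected_lookalikes(population: int, one_in: int = ONE_IN) -> float:
    """Expected number of people whose face map would match a given user."""
    return population / one_in


for region, population in POPULATIONS.items():
    print(f"{region}: {expected_lookalikes(population):,.2f}")
```

Only when the expected count drops below one, as it does for Iceland here, would a typical user have no expected match in that population.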

          • Telodzrum@lemmy.world
            English
            arrow-up
            2
            arrow-down
            34
            ·
            1 month ago

            Yeah, which is a really good number and allows for near complete elimination of false matches along this vector.

            • 4am@lemm.ee
              English
              arrow-up
              26
              arrow-down
              1
              ·
              1 month ago

              I promise bro it’ll only starve like 400 people please bro I need this

              • Telodzrum@lemmy.world
                English
                arrow-up
                2
                arrow-down
                15
                ·
                1 month ago

                No you misunderstood. That is a reduction in commonality by a literal factor of one million. Any secondary verification point is sufficient to reduce the false positive rate to effectively zero.

                • AwesomeLowlander@lemmy.dbzer0.com
                  English
                  arrow-up
                  18
                  ·
                  1 month ago

                  secondary verification point

                  Like, running a card sized piece of plastic across a reader?

                  It’d be nice if they were implementing this to combat credit card fraud or something similar, but that’s not how this is being deployed.

                • BassTurd@lemmy.world
                  English
                  arrow-up
                  15
                  ·
                  1 month ago

                  Which means the face recognition was never necessary. It’s a way for companies to build a database that will eventually get exploited. 100% guarantee.

        • Cethin@lemmy.zip
          English
          arrow-up
          22
          arrow-down
          1
          ·
          1 month ago

          Yeah, people with totally different facial structures get identified as the same person all the time with the “AI” facial recognition, especially if you’re darker skinned. Luckily (or unluckily) I’m white as can be.

          I’m assuming Apple’s software is a purpose built algorithm that detects facial features and compares them, rather than the black box AI where you feed in data and it returns a result. That’s the smart way to do it, but it takes more effort.

          • CeeBee@lemmy.world
            English
            arrow-up
            2
            ·
            1 month ago

            people with totally different facial structures get identified as the same person all the time with the “AI” facial recognition

            All the time, eh? Gonna need a citation on that. And I’m not talking about just one news article that pops up every six months. And nothing that links back to the ACLU’s 2018 misleading “report”.

            I’m assuming Apple’s software is a purpose built algorithm that detects facial features and compares them, rather than the black box AI where you feed in data and it returns a result.

            You assume a lot here. People have this conception that all FR systems are trained blackbox models. This is true for some systems, but not all.

            The system I worked with, which ranked near the top of the NIST FRVT reports, did not use a trained AI algorithm for matching.

            • Cethin@lemmy.zip
              English
              arrow-up
              1
              arrow-down
              1
              ·
              1 month ago

              I’m not doing a bunch of research to prove the point. I’ve been hearing about them being wrong fairly frequently, especially on darker skinned people, for a long time now. It doesn’t matter how often it is. It sounds like you have made up your mind already.

              I’m assuming that of apple because it’s been around for a few years longer than the current AI craze has been going on. We’ve been doing facial recognition for decades now, with purpose built algorithms. It’s not much of a leap to assume that’s what they’re using.

              • CeeBee@lemmy.world
                English
                arrow-up
                2
                ·
                1 month ago

                I’ve been hearing about them being wrong fairly frequently, especially on darker skinned people, for a long time now.

                I can guarantee you haven’t. I’ve worked in the FR industry for a decade and I’m up to speed on all the news. There’s a story about a false arrest from FR at most once every 5 or 6 months.

                You don’t see any reports from the millions upon millions of correct detections that happen every single day. You just see the one off failure cases that the cops completely mishandled.

                I’m assuming that of apple because it’s been around for a few years longer than the current AI craze has been going on.

                No it hasn’t. FR systems have been around a lot longer than Apple devices doing FR. The current AI craze is mostly centered around LLMs; object detection and FR systems have been evolving for more than 2 decades.

                We’ve been doing facial recognition for decades now, with purpose built algorithms. It’s not much of a leap to assume that’s what they’re using.

                Then why would you assume companies doing FR longer than the recent “AI craze” would be doing it with “black boxes”?

                I’m not doing a bunch of research to prove the point.

                At least you proved my point.

                • Cethin@lemmy.zip
                  English
                  arrow-up
                  2
                  ·
                  1 month ago

                  You don’t see any reports from the millions upon millions of correct detections that happen every single day. You just see the one off failure cases that the cops completely mishandled.

                  Obviously. I don’t have much of an issue with it when it’s working properly (although I do absolutely have an issue with it still). It being wrong and causing issues fairly frequently, and every 5 or 6 months is frequent (this is a low number, just the frequency of it causing newsworthy issues) with it not being deployed widely yet, is a pretty big issue. Scale that up by several orders of magnitude if it’s widely adopted and the errors will be constant.

                  No it hasn’t. FR systems have been around a lot longer than Apple devices doing FR. The current AI craze is mostly centered around LLMs; object detection and FR systems have been evolving for more than 2 decades… Then why would you assume companies doing FR longer than the recent “AI craze” would be doing it with “black boxes”?

                  You’re repeating what I said. Apple’s FR tech is a few years older than the machine learning tech that we have now. FR in general is several decades old, and it’s not ML based. It’s not a black box. You can actually know what it’s doing. I specifically said they weren’t doing it with black boxes. I said the AI models are. Please read again before you reply.

                  At least you proved my point.

                  You wrongly assuming what I said, which is actually the opposite of what I said, is the reason I’m not putting in the effort. You’ve made up your mind. I’m not going to change it, so I’m not putting in the effort it would take to gather the data, just to throw it into the wind. It sounds like you are already aware of some of it, but somehow think it’s not bad.

        • 4am@lemm.ee
          English
          arrow-up
          21
          arrow-down
          2
          ·
          1 month ago

          And yet this woman was mistaken for a 19-year-old 🤔

          • Telodzrum@lemmy.world
            English
            arrow-up
            2
            arrow-down
            18
            ·
            1 month ago

            Shitty implementation doesn’t mean shitty concept, you’d think a site full of tech nerds would understand such a basic concept.

            • Hawk@lemmy.dbzer0.com
              English
              arrow-up
              0
              ·
              1 month ago

              Pretty much everyone here agrees that it’s a shitty concept. Doesn’t solve anything and it’s a privacy nightmare.

        • chiisana@lemmy.chiisana.net
          English
          arrow-up
          8
          ·
          1 month ago

          I think from a purely technical point of view, you’re not going to get FaceID kind of accuracy on theft prevention systems. Primarily because FaceID uses IR array scanning within arm’s reach from the user, whereas theft prevention is usually scanned from much further away. The distance makes it much harder to get the fidelity of data required for an accurate reading.

          • sugar_in_your_tea@sh.itjust.works
            English
            arrow-up
            5
            ·
            1 month ago

            Yup, it turns out if you have millions of pixels to work with, you have a better shot at correctly identifying someone than if you have dozens.

          • CeeBee@lemmy.world
            English
            arrow-up
            1
            ·
            1 month ago

            I think from a purely technical point of view, you’re not going to get FaceID kind of accuracy on theft prevention systems. Primarily because FaceID uses IR array scanning within arm’s reach from the user, whereas theft prevention is usually scanned from much further away. The distance makes it much harder to get the fidelity of data required for an accurate reading.

            This is true. The distance definitely makes a difference, but there are systems out there that get incredibly high accuracy even with surveillance footage.

      • Thassodar@lemm.ee
        English
        arrow-up
        10
        ·
        1 month ago

        Shit even the motion sensors on the automated sinks have trouble recognizing dark skinned people! You have to show your palm to turn the water on most times!

      • nyan@lemmy.cafe
        English
        arrow-up
        8
        ·
        1 month ago

        Technically, there’s a tendency for them to be trained on datasets that don’t include nearly enough dark-skinned people. As a result, they don’t learn to make the necessary distinctions. I’d like to think that the selection of datasets for training facial recognition AI has improved since the most egregious cases of that. I’m not willing to bet on it, though.

      • CeeBee@lemmy.world
        English
        arrow-up
        6
        arrow-down
        29
        ·
        1 month ago

        No they aren’t. This is the narrative that keeps getting repeated over and over. And the citation for it is usually the ACLU’s test on Amazon’s Rekognition system, which was deliberately flawed to produce this exact outcome (people years later still saying the same thing).

        The top FR systems have no issues with any skin tones or complexions.

          • CeeBee@lemmy.world
            English
            arrow-up
            6
            arrow-down
            13
            ·
            edit-2
            1 month ago

            I promise I’m more aware of all the studies, technologies, and companies involved. I worked in the industry for many years.

            The technical studies you’re referring to show that the difference between a white man and a black woman (usually polar opposite in terms of results) is around 0.000001% error rate. But this usually gets blown out of proportion by media outlets.

            If you have white men at 0.000001% error rate and black women at 0.000002% error rate, then what gets reported is “facial recognition for black women is 2 times worse than for white men”.

            It’s technically true, but in practice it’s a misleading and disingenuous statement.

            Edit: here’s the actual technical report if anyone is interested

            https://pages.nist.gov/frvt/reports/1N/frvt_1N_report.pdf
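
The relative versus absolute framing can be made concrete with a small sketch. It uses the illustrative rates from the comment above (0.000001% and 0.000002%), which are example figures rather than values taken from the report, and a hypothetical volume of one billion comparisons.

```python
# Illustrative error rates from the comment, converted from percent to fractions.
RATE_A = 0.000001 / 100
RATE_B = 0.000002 / 100


def relative_ratio(rate_a: float, rate_b: float) -> float:
    """How many times larger rate_b is than rate_a (the headline '2x' number)."""
    return rate_b / rate_a


def extra_errors(rate_a: float, rate_b: float, comparisons: int) -> float:
    """Additional errors expected at rate_b compared with rate_a."""
    return (rate_b - rate_a) * comparisons


COMPARISONS = 1_000_000_000  # hypothetical volume, for illustration only

print(f"relative: {relative_ratio(RATE_A, RATE_B):.1f}x")
print(f"extra errors per {COMPARISONS:,} comparisons: "
      f"{extra_errors(RATE_A, RATE_B, COMPARISONS):.1f}")
```

Both numbers describe the same gap, which is why a headline can report "2 times worse" while the absolute difference stays very small. Whether that gap matters still depends on how many comparisons a deployment runs.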

            • AwesomeLowlander@lemmy.dbzer0.com
              English
              arrow-up
              11
              ·
              1 month ago

              Would you kindly link some studies backing up your claims, then? Because nothing I’ve seen online has similar numbers to what you’re claiming

                • Richard@lemmy.world
                  English
                  arrow-up
                  5
                  ·
                  1 month ago

                  It saddens me that you are being downvoted for providing a detailed factual report from an authoritative source. I apologise in the name of all Lemmy for these ignorant people

                  • CeeBee@lemmy.world
                    English
                    arrow-up
                    2
                    ·
                    1 month ago

                    Ya, most upvotes and downvotes are entirely emotionally driven. I knew I would get downvoted for posting all this. It happens on every forum, Reddit post, and Lemmy post. But downvotes don’t make the info I share wrong.

                  • CeeBee@lemmy.world
                    English
                    arrow-up
                    2
                    ·
                    1 month ago

                    Np.

                      As someone else pointed out in another comment, I’ve been using the x% accuracy number incorrectly. It’s just a colloquial way of conveying the accuracy. The truth is that no one in the industry uses “percent accuracy”; they instead use FMR (false match rate) and FNMR (false non-match rate) as well as some other metrics.
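
For readers unfamiliar with the two metrics, here is a minimal sketch of how FMR and FNMR are computed at a given threshold. The similarity scores are made-up toy data, and the convention that a score at or above the threshold counts as a match is an assumption.

```python
from typing import Sequence


def fmr(impostor_scores: Sequence[float], threshold: float) -> float:
    """False match rate: share of different-person pairs accepted as a match."""
    return sum(s >= threshold for s in impostor_scores) / len(impostor_scores)


def fnmr(genuine_scores: Sequence[float], threshold: float) -> float:
    """False non-match rate: share of same-person pairs rejected."""
    return sum(s < threshold for s in genuine_scores) / len(genuine_scores)


# Toy similarity scores in [0, 1]; invented for illustration only.
GENUINE = [0.91, 0.88, 0.95, 0.72, 0.97]    # same-person comparisons
IMPOSTOR = [0.12, 0.35, 0.81, 0.40, 0.22]   # different-person comparisons

for threshold in (0.5, 0.8, 0.9):
    print(f"threshold={threshold}: FMR={fmr(IMPOSTOR, threshold):.2f}, "
          f"FNMR={fnmr(GENUINE, threshold):.2f}")
```

Raising the threshold pushes FMR down and FNMR up, which is why a single "percent accuracy" figure hides the trade-off a deployment has chosen.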

    • Liz@midwest.social
      English
      arrow-up
      9
      ·
      1 month ago

      I have come across a stranger online who looks exactly like me. We even share the same first name. We even live in the same area. I’m so excited for this wonderful new technology…

      ಠ⁠_⁠ಠ

    • TrickDacy@lemmy.world
      English
      arrow-up
      6
      ·
      1 month ago

      Are we assuming there is no pin or any other auth method? That would be unlike any other payment system I’m aware of. I have to fingerprint scan on my phone to use my credit cards even though I just unlocked my phone to attempt it

      • MajorHavoc@programming.dev
        English
        arrow-up
        2
        ·
        1 month ago

        I’ve got an extremely common-looking face in a major city.

        Indeed, it’s likely to be a problem, if you stick with committing few or no crimes.

        The good news is that, should you choose to commit an above-average number of crimes, then the system will be prone to under-report you.

        So that’s nice. /sarcasm, I’m not actually advocating for more crimes. Though I am pointing out that maybe the folks installing these things aren’t incentivising the behavior they want.