Proposal for Fraud Detection Capabilities #4

Open
yarnehermann opened this issue Jun 24, 2022 · 3 comments

@yarnehermann

yarnehermann commented Jun 24, 2022

This proposal lists capabilities identified as useful for a number of agreed-upon fraud-related use cases.

Recognize whether the same device is seen again in the context of the same identity

This applies both within the same domain and across domains. Detecting a returning device has traditionally been done by storing IDs persistently in the browser and reusing them when the device is seen again. The lifespan of detection needs to be at least 7 days.

Use cases

  • Account Creation: Seeing an identity being used with a device it has never been seen with before (or that's very different from the devices it has been seen with before) can indicate identity theft where a fraudster uses a known identity to open new accounts.
  • Account Takeover: An account that is taken over will be accessed from a new device, i.e. a device it hasn’t been used with before. This can be treated as an additional risk compared to devices that have been regularly seen with said account. The lifespan during which a device can be recognized as ‘seen again’ is very relevant here, as shorter windows will make every account access look as if it came from a never-before-seen device.
  • 1st Party Fraud: A person committing 1st party fraud and then claiming to be a victim of identity theft can in some cases be detected by recognizing that their own device was used to commit said fraud.
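
As a rough illustration of the "seen again" check, here is a minimal sketch assuming a hypothetical server-side store of (identity, device ID) sightings and the 7-day minimum lifespan stated above. The names `DeviceHistory`, `record`, and `seen_before` are illustrative assumptions, not part of any proposed API, and how the device ID is obtained is left open.

```python
from datetime import datetime, timedelta

# Assumption: 7 days is the minimum recognition window named in the proposal.
RECOGNITION_WINDOW = timedelta(days=7)


class DeviceHistory:
    """Hypothetical in-memory record of which devices each identity has used."""

    def __init__(self, window=RECOGNITION_WINDOW):
        self.window = window
        # Maps (identity, device_id) -> time of the most recent sighting.
        self._last_seen = {}

    def record(self, identity, device_id, when):
        """Store the latest time this identity was seen with this device."""
        self._last_seen[(identity, device_id)] = when

    def seen_before(self, identity, device_id, now):
        """Return True if this identity used this device within the window."""
        last = self._last_seen.get((identity, device_id))
        return last is not None and now - last <= self.window
```

With a sighting recorded on a given day, `seen_before` returns `True` for the same identity and device three days later, and `False` after 30 days or for a device the identity has never used. A shorter window would make more returning devices look new, which is the concern raised in the Account Takeover use case.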

Recognize whether the same device is seen again in the context of multiple identities

Similar to the above, but recognizing the usage of the same device with different identities.

Use cases

  • Account Creation: Fraudsters often steal multiple identities. Seeing many different identities’ data coming from the same device is a high risk signal, since legitimate users mostly use a single identity with their device.
  • Account Takeover: A similar point holds for account takeover when e.g. a fraudster uses a list of leaked usernames and passwords to take over multiple victims' accounts. Recognizing that logins to multiple accounts are coming from the same device can be used to indicate high risk.
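
A minimal sketch of this signal, assuming events arrive as hypothetical `(device_id, identity)` pairs. The threshold of 3 is an arbitrary illustrative value, not a recommendation from this proposal.

```python
from collections import defaultdict

# Assumption: more than this many distinct identities on one device is high risk.
MAX_IDENTITIES_PER_DEVICE = 3


def identities_per_device(events):
    """Group (device_id, identity) pairs into device_id -> set of identities."""
    seen = defaultdict(set)
    for device_id, identity in events:
        seen[device_id].add(identity)
    return seen


def high_risk_devices(events, threshold=MAX_IDENTITIES_PER_DEVICE):
    """Return the devices seen with more than `threshold` distinct identities."""
    grouped = identities_per_device(events)
    return {device for device, ids in grouped.items() if len(ids) > threshold}
```

Repeated logins by the same identity on the same device do not raise the count, since identities are collected in a set. In practice this count would be one feature among many rather than a hard rule.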

Retrieve a device’s IP address

Knowledge of the IP address offers many benefits in the fight against identity fraud:

  • VPN/Tor detection: Many cases of fraud are committed from devices on VPNs. This is therefore a valuable risk signal.
  • Relatively unique identifier: Similar to the use cases above. E.g. Account Creation: seeing unusual numbers of account creations from the same IP or IP range indicates risk; Account Takeover: seeing an identity with an IP address that it has never been associated with. These are ways in which knowledge of the IP address can contribute to risk insights.
  • IP block nature: If certain IPs within a block have been associated with fraud, the overarching block can be preemptively labeled as more risky.
  • Approximate location information: IP addresses are associated with geolocation information. During account creation, if an IP's geolocation is far from the submitted address, this indicates a higher risk of fraud.
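
The "IP block nature" point can be sketched with Python's standard `ipaddress` module. This is a minimal illustration only: the /24 block size is an assumption chosen for the example, and the function names are hypothetical.

```python
import ipaddress


def risky_blocks(fraud_ips, prefix_len=24):
    """Return the enclosing networks of IPs already associated with fraud.

    Assumption: a /24 block is used for illustration; real block boundaries
    would come from allocation or routing data.
    """
    return {
        ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
        for ip in fraud_ips
    }


def in_risky_block(ip, blocks):
    """Return True if `ip` falls inside any block labeled as risky."""
    addr = ipaddress.ip_address(ip)
    return any(addr in block for block in blocks)
```

For example, once `203.0.113.7` has been associated with fraud, `in_risky_block("203.0.113.200", blocks)` is true because both addresses share the same /24, while an address in `198.51.100.0/24` is not flagged.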

Know the geographic location of the device

This refers to geolocation information through GPS / Wi-Fi triangulation, as this offers additional confidence over only IP-based geolocation, particularly when proxies are being used.

Use cases

  • Account Creation: Similar to how IP geolocation helps, discrepancies between device geolocation and a submitted address increase fraud risk.
  • Account Takeover: When an account is taken over, it generally happens from a different location than where the victim usually uses their account. Impossibly fast jumps in geolocation between two logins for an identity also indicate account takeover (e.g. a person cannot physically log in from New York and, 5 minutes later, from San Francisco).
  • 1st Party Fraud: A person committing 1st party fraud can change their IP address but still be in the same physical location. Detecting fraud being committed from the same location is an indicator of 1st party fraud.
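
The "impossibly fast jump" check can be sketched as a speed test between two logins, using the haversine great-circle distance. This is a minimal sketch: the 1000 km/h cutoff (roughly airliner speed) and the `(lat, lon, time)` login tuple are assumptions made for illustration.

```python
import math
from datetime import datetime, timedelta

EARTH_RADIUS_KM = 6371.0
# Assumption: travel faster than a commercial airliner is treated as implausible.
MAX_PLAUSIBLE_SPEED_KMH = 1000.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def is_impossible_travel(login_a, login_b, max_speed_kmh=MAX_PLAUSIBLE_SPEED_KMH):
    """Each login is a hypothetical (latitude, longitude, datetime) tuple."""
    (lat1, lon1, t1), (lat2, lon2, t2) = login_a, login_b
    distance = haversine_km(lat1, lon1, lat2, lon2)
    hours = abs((t2 - t1).total_seconds()) / 3600
    if hours == 0:
        # Simultaneous logins are only possible from the same place.
        return distance > 0
    return distance / hours > max_speed_kmh
```

With New York at roughly (40.71, -74.01) and San Francisco at roughly (37.77, -122.42), the two cities are a little over 4,000 km apart, so logins 5 minutes apart are flagged while logins 6 hours apart are not.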

Know that the device is a real device and its type

This includes detecting the device’s type and whether or not it is running in a simulator.

Knowledge of the device types used improves accuracy of fraud detection:

  • Detecting the device types used by a fraud ring improves accuracy when filtering e.g. an IP block, in order to avoid blocking good users who are not in the fraud ring.
  • Detecting the device types or brands used by a good identity informs risk when unexpected device types or brands are used for that identity (e.g. a longtime Chromebook user switching to a MacBook).

Know that a human user is interacting with the device

This comes down to having the capabilities required to detect bots/headless browsers/scripts.

Use cases

  • Account Creation: Fraudulent account creation is often automated and can employ bots. Distinguishing an automated user from a human user is a powerful risk indicator.
  • Account Takeover: A large-scale account takeover attack can be automated. Also, account takeover can be bootstrapped through a set of logins and snooping before stealing funds. This process of logins and snooping can also be automated.

@samuel-t-jackson
Collaborator

Interesting. I think we might also want to include something like 'know whether the same device is seen repeatedly in the context of multiple identities' - essentially velocity features for device. Another one that comes to mind is knowing the tenure of a device on a given network or in association with an identifier such as email.

@michaelficarra
Member

I like all of these suggestions, but the geolocation one is tricky. Since GPS-based or Wi-Fi-based information is effectively self-reported from untrusted sensors that can be manipulated, I don't think we could ever rely on it for antifraud purposes. I could see deriving coarse geo information based on ping from trusted servers in known locations, possibly.

@samuel-t-jackson
Collaborator

Yeah, location today is pretty fickle, but when it works, it works well. IP location is more useful than self-reported device location. Today there is a lot of commercially available metadata about IP addresses that enables orgs to cut through the noise (for example, a known residential IP will have a lot more value in terms of location than an MNO NAT IP). Still, it is probably reliable < 50% of the time.

That's true of many browser signals, and one of the reasons why ML is so essential for fraud prevention - hundreds of features go into models, and that breadth of information leads to resilient predictions that would not otherwise be viable.

In general I am concerned that if we replace the breadth of information that currently supports these use cases with a few well-defined, official/sanctioned methods, it will not be nearly as effective.
