Many domain-specific characteristics of the data are used for fraud detection. For example, it is well known that transactions with large absolute amounts may correspond to anomalies. The most common technique is to build user profiles from short segments of transaction sequences. Typically, the ordering of the transactions within a short segment is immaterial. If desired, a single transaction of the user can be used instead. Either a single transaction or a short sequence of transactions is converted into a feature vector, which is then compared to the user's profile. The key is to design a similarity function that can encode the wide diversity of attribute types, the collective profile within a short segment, and domain-specific knowledge (e.g., higher transaction values or sudden bursts of high-value transactions are more likely to be fraudulent).
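The profile comparison described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than a prescribed method: the `Transaction` fields, the choice of features (mean absolute amount and category frequencies), the one-sided z-score for amounts, the histogram intersection for categories, and the equal weighting of the two parts.

```python
from dataclasses import dataclass
from statistics import mean, pstdev
from typing import Dict, List


@dataclass(frozen=True)
class Transaction:
    amount: float
    category: str  # hypothetical categorical attribute, e.g. merchant type


@dataclass(frozen=True)
class UserProfile:
    mean_amount: float
    std_amount: float
    category_freq: Dict[str, float]


def segment_features(segment: List[Transaction]) -> dict:
    """Summarize a short segment with order-independent aggregates."""
    n = len(segment)
    counts: Dict[str, int] = {}
    for t in segment:
        counts[t.category] = counts.get(t.category, 0) + 1
    return {
        "mean_amount": mean(abs(t.amount) for t in segment),
        "category_freq": {c: k / n for c, k in counts.items()},
    }


def build_profile(history: List[Transaction]) -> UserProfile:
    """Build a user profile from past transactions (assumed legitimate)."""
    amounts = [abs(t.amount) for t in history]
    # Fall back to 1.0 when all amounts are identical, to avoid dividing by zero.
    std = pstdev(amounts) or 1.0
    return UserProfile(mean(amounts), std, segment_features(history)["category_freq"])


def similarity(features: dict, profile: UserProfile) -> float:
    """Score in [0, 1]; lower values mean the segment deviates more from the profile."""
    # Domain knowledge: only amounts ABOVE the user's norm are penalized,
    # so the z-score is clipped at zero (one-sided).
    z = max(0.0, (features["mean_amount"] - profile.mean_amount) / profile.std_amount)
    amount_sim = 1.0 / (1.0 + z)
    # Categorical attributes: overlap of the two frequency distributions.
    seg_freq = features["category_freq"]
    cats = set(seg_freq) | set(profile.category_freq)
    category_sim = sum(
        min(seg_freq.get(c, 0.0), profile.category_freq.get(c, 0.0)) for c in cats
    )
    # Equal weights are an arbitrary illustrative choice.
    return 0.5 * amount_sim + 0.5 * category_sim


if __name__ == "__main__":
    history = [
        Transaction(20.0, "grocery"), Transaction(25.0, "fuel"),
        Transaction(30.0, "grocery"), Transaction(22.0, "fuel"),
        Transaction(28.0, "grocery"),
    ]
    profile = build_profile(history)
    burst = [Transaction(900.0, "electronics"), Transaction(1200.0, "electronics")]
    print(similarity(segment_features(burst), profile))
```

Because the segment features are aggregates, reordering the transactions within a segment leaves the score unchanged, which matches the observation that ordering within a short segment is immaterial. A single transaction is handled as a segment of length one. In practice the features, the weights, and any alert threshold on the score would need to be tuned to the data at hand.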