Topic: Asirra
Posted: 19. 03. 2007.   #5
Petar Marić

Off topic: Heh, lately I've been seriously thinking of doing my master's thesis on the security of different classes of CAPTCHA tests.

Pros:
* Harder for computers than a text CAPTCHA
* A decent-sized image database

Cons:
* Image quality - on some pictures it is unclear what is actually shown, either because of the quality of the picture or because of the critter itself
* Cost - knowing MS, this service won't stay free for much longer
* Slow - compared to the usual CAPTCHAs, 12x more images have to be fetched
* Unfamiliarity - the question is how long it takes the average user to figure out what is being asked of them, especially if they don't know English.
* Similarity - regardless of its protective mechanisms, this kind of system can be broken too: math + good AI + bot-net + brute-force + cheap human labor. It is only a matter of will and money.
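
To put a rough number on the "math + good AI + brute-force" part, here is a minimal sketch. It assumes (my simplification, not something Asirra documents here) that every one of the 12 pictures is an independent two-way decision and that a test is passed only when all 12 are labeled correctly. The function names are hypothetical.

```python
# Assumption: 12 independent binary decisions per test, all must be correct.
PICTURES_PER_VIEW = 4 * 3


def pass_probability(per_picture_accuracy, pictures=PICTURES_PER_VIEW):
    """Chance of passing one test if each picture is labeled correctly
    with the given probability, independently of the others."""
    return per_picture_accuracy ** pictures


def expected_attempts(per_picture_accuracy, pictures=PICTURES_PER_VIEW):
    """Average number of tests a bot must try before one is passed."""
    return 1.0 / pass_probability(per_picture_accuracy, pictures)


if __name__ == "__main__":
    # 0.5 is blind guessing; the higher values stand in for a trained classifier.
    for accuracy in (0.5, 0.8, 0.9, 0.99):
        print("accuracy %.2f -> pass chance %.6f, about %.0f attempts per pass" % (
            accuracy,
            pass_probability(accuracy),
            expected_attempts(accuracy),
        ))
```

Under these assumptions blind guessing passes one test in 2**12 = 4096, and the odds improve quickly as per-picture accuracy rises, which is why the quality of the classifier matters more than raw request volume.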


To keep this post from being pure theorizing, here is a small simulation script:
Code:
"""
A script for testing the security of the Asirra CAPTCHA
(http://research.microsoft.com/asirra/)

The idea behind this script is pretty simple:
  * First, assume we have a way to learn which animal is in a picture (cheap
    human labor or AI - a neural network).
  * Then, with the help of this script, we estimate how many CAPTCHA requests
    we need to make in order to collect a given number of animal pictures.
  * Once we have a rough estimate of the needed requests, we employ our
    bot network to fetch the pictures.
  * We classify the pictures using the selected learning algorithm.
  * Now we train our Agent (AI) to recognize the remaining unknown pictures.
  * CAPTCHA PASSED :)
"""
__author__ = 'Petar Maric - http://www.petarmaric.com/'

TOTAL_PICTURES = 2*10**6 # They say "It's powered by over two million photos"
PICTURES_PER_VIEW = 4*3  # The CAPTCHA test picture grid is 4x3 pictures

# List of how many CAPTCHA requests to make
TRIES_LIST = range(5*10**4, 5*10**5, 5*10**4)

###############################
# You can look, but no touching
###############################

import random

ALL_PICTURES = range(TOTAL_PICTURES)

def num_pictures_learned(num_tries):
    """Returns the number of distinct pictures seen after num_tries requests"""
    pictures_learned = set()
    for _ in range(num_tries):
        # The simulation assumes each request shows PICTURES_PER_VIEW
        # distinct pictures chosen uniformly at random from the whole base
        pictures_learned.update(random.sample(ALL_PICTURES, PICTURES_PER_VIEW))
    return len(pictures_learned)

def main():
    for num_tries in TRIES_LIST:
        learned = num_pictures_learned(num_tries)
        learned_percent = 100.0 * learned / TOTAL_PICTURES
        print("Learned %d/%d (%.2f%%) with %d tries." % (
            learned,
            TOTAL_PICTURES,
            learned_percent,
            num_tries
        ))

if __name__ == "__main__":
    main()
Result:
Code:
Learned 518207/2000000 (25.91%) with 50000 tries.
Learned 901901/2000000 (45.10%) with 100000 tries.
Learned 1187203/2000000 (59.36%) with 150000 tries.
Learned 1396808/2000000 (69.84%) with 200000 tries.
Learned 1553736/2000000 (77.69%) with 250000 tries.
Learned 1669414/2000000 (83.47%) with 300000 tries.
Learned 1754926/2000000 (87.75%) with 350000 tries.
Learned 1818577/2000000 (90.93%) with 400000 tries.
Learned 1865694/2000000 (93.28%) with 450000 tries.
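
As a sanity check on these numbers, here is a minimal sketch of the closed-form expectation. It rests on the same assumption as the simulation: every request shows 12 distinct pictures drawn uniformly at random from the 2 million. A given picture is then missed by one request with probability 1 - 12/N, so after n requests the expected number of distinct pictures seen is N * (1 - (1 - 12/N)**n). The names `expected_pictures_learned` and `tries_needed` are illustrative and not part of the original script.

```python
import math

TOTAL_PICTURES = 2 * 10**6
PICTURES_PER_VIEW = 4 * 3


def expected_pictures_learned(num_tries, total=TOTAL_PICTURES,
                              per_view=PICTURES_PER_VIEW):
    """Expected number of distinct pictures seen after num_tries requests,
    assuming uniform sampling without replacement within each request."""
    miss_once = 1.0 - per_view / total  # a given picture is not shown
    return total * (1.0 - miss_once ** num_tries)


def tries_needed(target_fraction, total=TOTAL_PICTURES,
                 per_view=PICTURES_PER_VIEW):
    """Smallest number of requests whose expected coverage reaches
    target_fraction of the picture base (0 <= target_fraction < 1)."""
    if not 0.0 <= target_fraction < 1.0:
        raise ValueError("target_fraction must be in [0, 1)")
    log_miss_once = math.log1p(-per_view / total)
    return math.ceil(math.log(1.0 - target_fraction) / log_miss_once)


if __name__ == "__main__":
    for num_tries in range(5 * 10**4, 5 * 10**5, 5 * 10**4):
        expected = expected_pictures_learned(num_tries)
        print("Expected %.0f/%d (%.2f%%) with %d tries." % (
            expected, TOTAL_PICTURES,
            100.0 * expected / TOTAL_PICTURES, num_tries))
    print("Tries for 99%% coverage: %d" % tries_needed(0.99))
```

The formula runs instantly, so it can replace the simulation when only the rough request count for a target coverage is needed; the simulation remains useful if the sampling assumption is changed.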
__________________
Python Ambassador of Serbia

Last edited by Petar Marić: 19. 03. 2007. at 01:33.