API reference

exception pylookyloo.AuthError

class pylookyloo.CaptureSettings

The capture settings that can be passed to Lookyloo.

class pylookyloo.CompareSettings

The settings that can be passed to the compare method on the Lookyloo side to filter out some differences.

exception pylookyloo.PyLookylooError

Lookyloo

class pylookyloo.Lookyloo(root_url: str = 'https://lookyloo.circl.lu/', useragent: str | None = None, *, proxies: dict[str, str] | None = None)
compare_captures(capture_left: str, capture_right: str, /, *, compare_settings: CompareSettings | None = None) dict[str, Any]

Compares two captures

Parameters:
  • capture_left – UUID of the capture to compare from

  • capture_right – UUID of the capture to compare to

  • compare_settings – The settings for the comparison itself (what to ignore without marking the captures as different)

enqueue(url: str | None = None, quiet: bool = False, document: Path | BytesIO | None = None, document_name: str | None = None, **kwargs) str

Enqueue a URL.

Parameters:
  • url – URL to enqueue

  • quiet – Returns the UUID only, instead of the whole URL

  • document – A document to submit to Lookyloo. It can be anything supported by a browser.

  • document_name – The name of the document (only if you passed a pseudofile).

  • kwargs – accepts all the parameters supported by Lookyloo.capture
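
As an illustration of the document and document_name parameters, here is a small sketch that wraps an in-memory HTML string in a BytesIO pseudofile and enqueues it with quiet=True. The helper name `enqueue_html`, the `_StubClient` class, and the all-zero UUID are assumptions made for the example; the stub only records the call so the snippet runs offline.

```python
from io import BytesIO
from typing import Any


def enqueue_html(client: Any, html: str, name: str) -> str:
    """Submit an in-memory HTML page; with quiet=True only the UUID comes back."""
    pseudofile = BytesIO(html.encode("utf-8"))
    # A BytesIO carries no filename, so document_name is passed explicitly.
    return client.enqueue(document=pseudofile, document_name=name, quiet=True)


class _StubClient:
    """Hypothetical stand-in for pylookyloo.Lookyloo; records each call."""

    def __init__(self):
        self.calls = []

    def enqueue(self, url=None, quiet=False, document=None,
                document_name=None, **kwargs):
        self.calls.append((url, quiet, document.getvalue(), document_name))
        return "00000000-0000-0000-0000-000000000000"  # placeholder UUID


stub = _StubClient()
uuid = enqueue_html(stub, "<h1>hello</h1>", "hello.html")
```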

get_apikey(username: str, password: str) dict[str, str]

Get the API key for the given user.

get_capture_stats(tree_uuid: str) dict[str, Any]

Get statistics of the capture

get_comparables(tree_uuid: str) dict[str, Any]

Get comparable information from the capture

get_complete_capture(capture_uuid: str) BytesIO

Returns a zip file that contains the screenshot, the HAR, the rendered HTML, and the cookies.

Parameters:

capture_uuid – UUID of the capture
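
Since the method returns the zip archive as a BytesIO, it can be opened directly with the standard zipfile module. The sketch below lists the members of such an archive. It builds a tiny archive locally to stand in for the real return value, and the member names are placeholders rather than the names Lookyloo actually uses.

```python
import zipfile
from io import BytesIO


def list_archive_members(archive: BytesIO) -> list:
    """Return the file names stored in a zip archive held in memory."""
    with zipfile.ZipFile(archive) as zf:
        return zf.namelist()


# Local stand-in for the BytesIO returned by get_complete_capture().
fake = BytesIO()
with zipfile.ZipFile(fake, "w") as zf:
    zf.writestr("screenshot.bin", b"\x00")      # placeholder member name
    zf.writestr("page.html", "<html></html>")   # placeholder member name
fake.seek(0)

members = list_archive_members(fake)
```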

get_cookies(capture_uuid: str) list[dict[str, str]]

Returns the complete cookie jar.

Parameters:

capture_uuid – UUID of the capture

get_data(capture_uuid: str) BytesIO

Returns the downloaded data.

Parameters:

capture_uuid – UUID of the capture

get_hash_occurrences(h: str) dict[str, Any]

Returns the base64-encoded body related to the hash, and a list of all the captures containing that hash.

Parameters:

h – sha512 to search

get_hashes(capture_uuid: str, algorithm: str = 'sha512', hashes_only: bool = True) StringIO

Returns all the hashes of all the bodies (including the embedded contents)

Parameters:
  • capture_uuid – UUID of the capture

  • algorithm – The algorithm of the hashes

  • hashes_only – If False, will also return the URLs related to the hashes
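
The return value is a StringIO, so it has to be read before the hashes can be used. The sketch below assumes that, with hashes_only=True, the buffer holds one hash per line; that layout is an assumption and is not stated in this reference. The sample values are placeholders, not real digests.

```python
from io import StringIO


def parse_hashes(buffer: StringIO) -> list:
    """Split a text buffer into non-empty, stripped lines."""
    # Assumption: one hash per line (not confirmed by this reference).
    return [line.strip() for line in buffer if line.strip()]


sample = StringIO("aaa\nbbb\n\nccc\n")  # placeholder values
hashes = parse_hashes(sample)
```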

get_hostname_occurrences(hostname: str, with_urls_occurrences: bool = False, limit: int = 20, cached_captures_only: bool = True) dict[str, Any]

Returns all the captures containing the hostname. It will be pretty slow on very common domains.

Parameters:
  • hostname – Hostname to lookup

  • with_urls_occurrences – If true, add details about the related URLs.

  • limit – The max amount of entries to return.

  • cached_captures_only – If False, Lookyloo will attempt to re-cache the missing captures. It might take some time.

get_hostnames(capture_uuid: str) dict[str, Any]

Returns all the hostnames seen during the capture.

Parameters:

capture_uuid – UUID of the capture

get_html(capture_uuid: str) StringIO

Returns the rendered HTML as it would be in the browser after the page loaded.

Parameters:

capture_uuid – UUID of the capture

get_info(tree_uuid: str) dict[str, Any]

Get information about the capture (url, timestamp, user agent)

get_modules_responses(tree_uuid: str) dict[str, Any]

Returns information from the 3rd party modules

Parameters:

tree_uuid – UUID of the capture

get_redirects(capture_uuid: str) dict[str, Any]

Returns the initial redirects.

Parameters:

capture_uuid – UUID of the capture

get_screenshot(capture_uuid: str) BytesIO

Returns the screenshot.

Parameters:

capture_uuid – UUID of the capture
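
Methods such as get_screenshot, get_data and get_complete_capture return a BytesIO rather than writing to disk. A minimal sketch of saving such a buffer to a file follows; the helper name `save_buffer` and the `.bin` extension are placeholders, since this reference does not state the image format.

```python
import tempfile
from io import BytesIO
from pathlib import Path


def save_buffer(buffer: BytesIO, destination: Path) -> int:
    """Write the whole in-memory buffer to a file; return the byte count."""
    return destination.write_bytes(buffer.getvalue())


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "screenshot.bin"  # extension is a placeholder
    # Fake content standing in for the buffer returned by get_screenshot().
    written = save_buffer(BytesIO(b"fake image bytes"), target)
    round_trip = target.read_bytes()
```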

get_stats() dict[str, Any]

Returns statistics about the Lookyloo instance.

get_status(tree_uuid: str) dict[str, Any]

Get the status of a capture:
  • -1: Unknown capture.

  • 0: The capture is queued up but not processed yet.

  • 1: The capture is ready.

  • 2: The capture is ongoing and will be ready soon.

get_takedown_information(capture_uuid: str, filter_contacts: Literal[True]) list[str]
get_takedown_information(capture_uuid: str, filter_contacts: Literal[False] = False) list[dict[str, Any]]

Returns information required to request a takedown for a capture

Parameters:
  • capture_uuid – UUID of the capture

  • filter_contacts – If True, will only return the contact emails and filter out the invalid ones.
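
The method is listed twice because its return type depends on the value of filter_contacts, which is how typing.overload signatures with Literal render. The toy function below illustrates that pattern only; it is not pylookyloo's implementation, and the `_CONTACTS` data and its keys are made-up placeholders.

```python
from typing import Any, Literal, Union, overload

# Placeholder data standing in for a server response.
_CONTACTS = [{"hostname": "example.com", "contacts": ["abuse@example.com"]}]


@overload
def takedown_info(filter_contacts: Literal[True]) -> list: ...
@overload
def takedown_info(filter_contacts: Literal[False] = False) -> list: ...


def takedown_info(filter_contacts: bool = False) -> Union[list, list]:
    """Toy function whose return shape depends on the flag."""
    if filter_contacts:
        # Flat list of email strings.
        return [email for entry in _CONTACTS for email in entry["contacts"]]
    # Full list of dictionaries.
    return _CONTACTS


emails = takedown_info(True)
details = takedown_info()
```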

get_url_occurrences(url: str, limit: int = 20, cached_captures_only: bool = True) dict[str, Any]

Returns all the captures containing the URL

Parameters:
  • url – URL to lookup

  • limit – The max amount of entries to return.

  • cached_captures_only – If False, Lookyloo will attempt to re-cache the missing captures. It might take some time.

get_urls(capture_uuid: str) dict[str, Any]

Returns all the URLs seen during the capture.

Parameters:

capture_uuid – UUID of the capture

hide_capture(tree_uuid: str) dict[str, str]

Hide a capture from the index page (requires an authenticated user, use init_apikey first)

init_apikey(username: str | None = None, password: str | None = None, apikey: str | None = None) None

Initialize the API key for the current session. All the requests against Lookyloo after this call will be authenticated.
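
Several methods in this reference (hide_capture, rebuild_capture, misp_push) require init_apikey to be called first. A minimal sketch of that ordering is shown below, using hide_capture as the authenticated call. The helper name, the `_StubClient` class, the key, and the response content are all hypothetical; the stub only records the call order.

```python
from typing import Any


def hide_with_apikey(client: Any, apikey: str, tree_uuid: str) -> dict:
    """Authenticate the session first, then hide the capture."""
    client.init_apikey(apikey=apikey)
    return client.hide_capture(tree_uuid)


class _StubClient:
    """Hypothetical stand-in for pylookyloo.Lookyloo; records call order."""

    def __init__(self):
        self.calls = []

    def init_apikey(self, username=None, password=None, apikey=None):
        self.calls.append("init_apikey")

    def hide_capture(self, tree_uuid):
        self.calls.append("hide_capture")
        return {"message": "placeholder response"}


stub = _StubClient()
response = hide_with_apikey(stub, "placeholder-key", "some-uuid")
```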

property is_up: bool

Test if the given instance is accessible

misp_export(tree_uuid: str) dict[str, Any]

Export the capture in MISP format

misp_push(tree_uuid: str) dict[str, Any] | list[dict[str, Any]]

Push the capture to a pre-configured MISP instance (requires an authenticated user, use init_apikey first).

Note: if the response is a dict, it is an error message. If it is a list, it is a list of MISP events.
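
Because the return type signals success or failure through its shape, callers need to branch on it. The following sketch turns an error dict into an exception and otherwise returns the list of events. The helper name `misp_events`, the choice of RuntimeError, and the sample payloads are assumptions made for illustration.

```python
from typing import Any, Union


def misp_events(response: Union[dict, list]) -> list:
    """Return the MISP events, or raise if the response is an error dict."""
    if isinstance(response, dict):
        # A dict means the push failed; surface it as an exception.
        raise RuntimeError(f"MISP push failed: {response}")
    return response


events = misp_events([{"info": "placeholder event"}])

try:
    misp_events({"error": "placeholder message"})
    failed = False
except RuntimeError:
    failed = True
```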

rebuild_capture(tree_uuid: str) dict[str, str]

Force rebuild a capture (requires an authenticated user, use init_apikey first)

send_mail(tree_uuid: str, email: str = '', comment: str | None = None) bool | dict[str, Any]

Reports a capture by sending an email to the investigation team

Parameters:
  • tree_uuid – UUID of the capture

  • email – Email of the reporter, used by the analyst to get in touch

  • comment – Description of the URL, will be given to the analyst

submit(*, quiet: bool = False, capture_settings: CaptureSettings | None = None) str
submit(*, quiet: bool = False, url: str | None = None, document_name: str | None = None, document: Path | BytesIO | None = None, browser: str | None = None, device_name: str | None = None, user_agent: str | None = None, proxy: str | dict[str, str] | None = None, general_timeout_in_sec: int | None = None, cookies: list[dict[str, Any]] | None = None, headers: str | dict[str, str] | None = None, http_credentials: dict[str, int] | None = None, geolocation: dict[str, float] | None = None, timezone_id: str | None = None, locale: str | None = None, color_scheme: str | None = None, viewport: dict[str, int] | None = None, referer: str | None = None, listing: bool | None = None, auto_report: bool | dict[str, str] | None = None) str

Submit a URL to a lookyloo instance.

Parameters:
  • quiet – Returns the UUID only, instead of the whole URL

  • capture_settings – Settings as a dictionary. It overwrites all other parameters.

  • url – URL to capture (incompatible with document and document_name)

  • document_name – Filename of the document to capture (required if document is used)

  • document – Document to capture itself (requires a document_name)

  • browser – The browser to use for the capture, must be something Playwright knows

  • device_name – The name of the device, must be something Playwright knows

  • user_agent – The user agent the browser will use for the capture

  • proxy – SOCKS5 proxy to use for capturing

  • general_timeout_in_sec – The capture will raise a timeout if it takes more than that time

  • cookies – A list of cookies

  • headers – The headers to pass to the capture

  • http_credentials – HTTP Credentials to pass to the capture

  • geolocation – The geolocation of the browser latitude/longitude

  • timezone_id – The timezone. Warning: it must be a valid timezone (continent/city)

  • locale – The locale of the browser

  • color_scheme – The preferred color scheme of the browser (light or dark)

  • viewport – The viewport of the browser used for capturing

  • referer – The referer URL for the capture

  • listing – If False, the capture will not be on the publicly accessible index page of Lookyloo

  • auto_report

    If set, the capture will automatically be forwarded to an analyst (if the instance is configured this way). Pass True if you want to autoreport without any setting, or a dictionary with two keys:

    • email (required): the email of the submitter, so the analyst can get in touch

    • comment (optional): a comment about the capture to help the analyst
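
The parameter list above states two constraints: url is incompatible with document and document_name, and document requires a document_name. The sketch below checks those two rules locally before a call and drops values left at None. The helper name `build_submit_kwargs` is hypothetical and this is not how the library validates its input.

```python
from io import BytesIO
from pathlib import Path
from typing import Any, Optional, Union


def build_submit_kwargs(url: Optional[str] = None,
                        document: Optional[Union[Path, BytesIO]] = None,
                        document_name: Optional[str] = None,
                        **settings: Any) -> dict:
    """Check the url/document rules and drop values that are None."""
    if url is not None and (document is not None or document_name is not None):
        raise ValueError("url is incompatible with document and document_name")
    if document is not None and document_name is None:
        raise ValueError("document requires a document_name")
    candidate = {"url": url, "document": document,
                 "document_name": document_name, **settings}
    # None is the default for every setting, so omitting it changes nothing.
    return {key: value for key, value in candidate.items() if value is not None}


kwargs = build_submit_kwargs(url="https://example.com", listing=False,
                             color_scheme="dark", locale=None)
```

The resulting dictionary can then be unpacked into the keyword-only call, for example `client.submit(**kwargs)`.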

trigger_modules(tree_uuid: str, force: bool = False) dict[str, Any]

Trigger all the available 3rd party modules on the given capture.

Parameters:

force – Trigger the modules even if they were already triggered today.