Get started

The Safe-Text API provides programmatic access for cleaning short and long texts. It is probably the world's most complete text-cleaning API, providing fast and accurate results based on machine learning and expert-made algorithms.

To use this API, you need an API key, available via the RapidAPI platform and through our API on Safe-Text. Please contact us by email for any help.

Note that the Safe-Text API is a large wrapper on top of other excellent solutions: we have glued many libraries together to work as one single API. It is a technically complex solution, as many technologies (Python, Rust, Wasm, Node ...) all work together in a single API call.

Note that Safe-Text updates its models regularly to improve accuracy.

Cleansing API

Use the Cleansing API to perform cleansing operations for supported languages:
/api/meta To get current information about the API
/api/clean_text?text=hello%20world&models=Punctuate To run the 'Punctuate' model

Supported languages: en

Note: All models are optional, so you can choose all or some of them.
The order of execution is always the same, though (at least in the current version). Play with the Swagger API.

          
            # Here is a curl example
            # '\' is specific to posix shells (replace with ^ for Windows)
            # Our input here is 'some emojis are funny (ง'⌣')ง but not all of them'

            curl -X 'GET'   'http://localhost:3000/api/clean_text?text=some%20emojis%20are%20funny%20(%C3%A0%C2%B8%E2%80%A1'\''%C3%A2%C5%92%C2%A3'\'')%C3%A0%C2%B8%E2%80%A1%20but%20not%20all%20of%20them&models=FixMojibak,Punctuate' \
              -H 'accept: application/json'
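For readers who prefer a script over curl, the same request can be sketched in Python with only the standard library. The host `http://localhost:3000` and the `accept` header are taken from the curl example; the function names `build_clean_text_url` and `clean_text` are hypothetical helpers, not part of the API. The source does not show how the API key is passed, so this sketch leaves authentication out.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import Request, urlopen

# Host copied from the curl example; replace it with your own deployment.
BASE_URL = "http://localhost:3000"


def build_clean_text_url(text, models, base_url=BASE_URL):
    """Build a /api/clean_text URL (hypothetical helper).

    Spaces are encoded as %20 and the commas between model names are kept
    literal, mirroring the URLs shown in this documentation.
    """
    query = urlencode(
        {"text": text, "models": ",".join(models)},
        safe=",",
        quote_via=quote,
    )
    return f"{base_url}/api/clean_text?{query}"


def clean_text(text, models, base_url=BASE_URL, timeout=10.0):
    """Send the GET request and return the decoded JSON body (hypothetical helper)."""
    request = Request(
        build_clean_text_url(text, models, base_url),
        headers={"accept": "application/json"},
    )
    with urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling `build_clean_text_url("hello world", ["Punctuate"])` reproduces the `/api/clean_text?text=hello%20world&models=Punctuate` form listed above; `clean_text` needs a reachable server.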
          
        

          
            Result example:

            {
              "call_response": {
                "models_applied": [
                  "FixMojibak",
                  "Punctuate"
                ],
                "input": {
                  "text": "some emojis are funny (ง'⌣')ง but not all of them"
                },
                "result": {
                  "clean": "some emojis are funny (v')')v, but not all of them.",
                  "stripped": "some emojis are funny (v')')v, but not all of them."
                },
                "input_hash": "cb5eeea278c1466a527a440b7b18bb8f40670972"
              },
              "call_id": "DfWNbaVFv2",
              "api_name": "cleanText",
              "api_version": "0.0.1",
              "call_time": "2023-05-13T09:14:26.120Z"
            }
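As a minimal sketch of how a client might read this response, the snippet below parses a trimmed copy of the result shown above and pulls out the fields most callers need. The field names come from the example; the `summarize` helper is hypothetical, and it assumes every successful response has the same shape as this single sample.

```python
import json

# Trimmed copy of the documented result example (the "input" block is omitted).
SAMPLE_RESPONSE = """
{
  "call_response": {
    "models_applied": ["FixMojibak", "Punctuate"],
    "result": {
      "clean": "some emojis are funny (v')')v, but not all of them.",
      "stripped": "some emojis are funny (v')')v, but not all of them."
    },
    "input_hash": "cb5eeea278c1466a527a440b7b18bb8f40670972"
  },
  "call_id": "DfWNbaVFv2",
  "api_name": "cleanText",
  "api_version": "0.0.1",
  "call_time": "2023-05-13T09:14:26.120Z"
}
"""


def summarize(payload):
    """Extract the cleaned text and call metadata (hypothetical helper)."""
    call = payload["call_response"]
    return {
        "clean": call["result"]["clean"],
        "stripped": call["result"]["stripped"],
        "models_applied": call["models_applied"],
        "call_id": payload["call_id"],
    }


summary = summarize(json.loads(SAMPLE_RESPONSE))
```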
          
        

Cleansing query parameters

Field | Type | Description
text | string | The text to be cleaned
models | Subset of ["FixHTML", "Linkify", "FixMojibak", "Punctuate", "Decancer", "BadWords", "StripTags", "DetectLanguage"] | Models to apply
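Because a misspelled model name leads to a validation error, it can help to check the `models` value on the client before sending it. The sketch below assumes the comma-separated form used in the curl example; `models_param` is a hypothetical helper, and the list of names is copied from the table above exactly as the API spells them (including "FixMojibak").

```python
# Model names exactly as listed in the query-parameter table.
SUPPORTED_MODELS = (
    "FixHTML",
    "Linkify",
    "FixMojibak",
    "Punctuate",
    "Decancer",
    "BadWords",
    "StripTags",
    "DetectLanguage",
)


def models_param(models):
    """Return the comma-separated `models` value, rejecting unknown names."""
    unknown = [name for name in models if name not in SUPPORTED_MODELS]
    if unknown:
        raise ValueError(f"Unsupported model(s): {', '.join(unknown)}")
    return ",".join(models)
```

For example, `models_param(["FixMojibak", "Punctuate"])` yields `FixMojibak,Punctuate`, while `models_param(["FixMojibake"])` raises `ValueError` before any request is made.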

Models

  • FixHTML
    We use Tidy HTML to correct invalid HyperText Markup Language, detect potential web accessibility errors, and improve the layout and indent style of the resulting markup.

  • DOMPurify
    DOMPurify sanitizes HTML and prevents XSS attacks.

  • Linkify
    Linkify finds links in plain text and converts them to HTML <a> tags.

  • FixMojibak
    We use ftfy under the hood. ftfy fixes Unicode that’s broken in various ways. The goal of ftfy is to take in bad Unicode and output good Unicode, for use in your Unicode-aware code.

  • Punctuate
    Punctuate is an ML model that fixes punctuation in English texts.

  • Decancer
    Decancer removes common Unicode confusables/homoglyphs from strings.

  • BadWords
    Removes profane English words.

  • StripTags
    Completely strips XML/HTML tags from text.

  • DetectLanguage
    Detects the language of the text.
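To make the FixMojibak description more concrete, here is a small standard-library illustration of one common kind of mojibake: UTF-8 bytes misread as Windows-1252. It is not how ftfy works internally (ftfy detects and handles many more cases automatically); `garble` and `repair` are hypothetical names used only for this demonstration, and `repair` only works when the damage is exactly this one round trip.

```python
def garble(text: str) -> str:
    """Simulate mojibake: encode as UTF-8, then wrongly decode as Windows-1252."""
    return text.encode("utf-8").decode("cp1252")


def repair(text: str) -> str:
    """Undo that specific mistake by reversing the two steps."""
    return text.encode("cp1252").decode("utf-8")


original = "(ง'⌣')ง"
broken = garble(original)  # the emoticon becomes a run of accented Latin characters
fixed = repair(broken)     # round trip restores the original string
assert fixed == original
```

The garbled form produced here corresponds to the percent-encoded text sent in the curl example above, which is why that request includes the FixMojibak model.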


Errors

Error code | Meaning
404 | Endpoint not found error: Wrong route
400 | Validation error: Wrong parameters (see above)
502 | Bad gateway: Server is completely out of service
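A client can turn these codes into readable messages with a small lookup. The sketch below covers only the three codes documented above; `explain_status` is a hypothetical helper, and the fallback text for other codes is an assumption, since the source does not describe them.

```python
# Meanings copied from the error table above.
ERROR_MEANINGS = {
    404: "Endpoint not found error: wrong route",
    400: "Validation error: wrong parameters",
    502: "Bad gateway: server is completely out of service",
}


def explain_status(status: int) -> str:
    """Map an HTTP status code to the documented meaning (hypothetical helper)."""
    # Codes outside the table are not documented, so report them generically.
    return ERROR_MEANINGS.get(status, f"Undocumented status code {status}")
```

With `urllib`, the code is available as the `code` attribute of the `urllib.error.HTTPError` that `urlopen` raises, so `explain_status(err.code)` can be called from the `except` block.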