@DavidBuchanan314
Last active December 31, 2024 18:58
Rabbit R1 Unofficial API Docs

The Rabbit R1 uses a few custom APIs to talk to The Cloud™. Almost nothing happens on-device, and all the AI magic happens on servers.

Consequently, you don't really need the physical device.

TLS Client Fingerprinting

In lieu of an authentication scheme, Rabbit's servers attempt to verify device authenticity by checking the TLS client's JA3 fingerprint, presumably enforced by AWS WAF.

If your TLS client doesn't match an expected fingerprint, you'll get HTTP 403 errors. This fingerprint works:

771,4865-4866-4867-49195-49196-52393-49199-49200-52392-49171-49172-156-157-47-53,0-23-65281-10-11-35-16-5-13-51-45-43-21,29-23-24,0

This is a common fingerprint for Android devices and is not exclusive to the R1.
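
For reference, a JA3 string is five comma-separated fields (TLS version, cipher suites, extensions, elliptic curves, point formats), with the values inside each field joined by dashes. The commonly quoted JA3 hash is the MD5 of that string. Here is a minimal Python sketch that splits the fingerprint above into its parts; `parse_ja3` is an illustrative name, not part of any tool mentioned here:

```python
import hashlib

# The fingerprint quoted above, split over two lines only for readability.
JA3 = (
    "771,4865-4866-4867-49195-49196-52393-49199-49200-52392-49171-49172-156-157-47-53,"
    "0-23-65281-10-11-35-16-5-13-51-45-43-21,29-23-24,0"
)


def parse_ja3(ja3: str) -> dict:
    """Split a JA3 string into its five fields (hypothetical helper)."""
    version, ciphers, extensions, curves, point_formats = ja3.split(",")

    def ints(field: str) -> list:
        # An empty field means "no values" in JA3.
        return [int(x) for x in field.split("-")] if field else []

    return {
        "version": int(version),  # 771 == 0x0303, i.e. TLS 1.2 in the ClientHello
        "ciphers": ints(ciphers),
        "extensions": ints(extensions),
        "curves": ints(curves),
        "point_formats": ints(point_formats),
    }


parsed = parse_ja3(JA3)
# The JA3 hash is the MD5 hex digest of the full string.
ja3_hash = hashlib.md5(JA3.encode("ascii")).hexdigest()
print(parsed["version"], len(parsed["ciphers"]), ja3_hash)
```

This only inspects the fingerprint. Actually presenting it on the wire still needs a TLS stack that lets you control the ClientHello, such as utls.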

I use utls to replicate this fingerprint like so (drop this into ja3proxy)

Account Activation

Visit https://rabbit.tech/activate in a web browser and set up an account. Follow the registration process, and you should end up with a QR code.

Decode the QR. I like to use zxing. You should get a URL that looks like:

https://hole.rabbit.tech/apis/linkDevice?userId=auth0%7Crandomhex&linkingPasscode=randomhex

Before requesting the URL, append a deviceId parameter set to a 15-digit decimal number. It's supposed to be an IMEI, but any value works (alternatively, generate a more realistic value with this).
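
As a rough sketch of that step in Python: the helper names below are made up for illustration, and the only "realism" applied to the generated value is the Luhn check digit that real IMEIs carry in their 15th position. The first 14 digits are plain random digits, not a valid device allocation code.

```python
import random
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def luhn_check_digit(payload: str) -> int:
    """Luhn check digit for a string of decimal digits (check digit excluded)."""
    total = 0
    for i, ch in enumerate(reversed(payload)):
        d = int(ch)
        if i % 2 == 0:  # double every second digit, starting from the rightmost
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return (10 - total % 10) % 10


def random_imei() -> str:
    """14 random digits plus a Luhn check digit (hypothetical helper)."""
    body = "".join(random.choice("0123456789") for _ in range(14))
    return body + str(luhn_check_digit(body))


def add_device_id(link_url: str, device_id: str) -> str:
    """Append deviceId to the query string of the URL decoded from the QR code."""
    parts = urlsplit(link_url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append(("deviceId", device_id))
    return urlunsplit(parts._replace(query=urlencode(query)))


qr_url = "https://hole.rabbit.tech/apis/linkDevice?userId=auth0%7Crandomhex&linkingPasscode=randomhex"
print(add_device_id(qr_url, random_imei()))
```

Whatever value you pick, write it down: the same deviceId is needed again when authenticating.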

Make an HTTP GET request to the modified URL, and on success you'll get a JSON response that looks like this:

{"actualUserId":"auth0|randomhex","userId":"randomhex","accountKey":"randomhex","userName":"blah"}

Keep these values safe, particularly accountKey; you'll need them later.

The account is now activated, and your browser session should have access to the "rabbithole", where you can let them skim your creds for 3rd party service integrations over VNC, and other such features.

The API

The main API is a JSON-based RPC-like mechanism running over a websocket, at wss://r1-api.rabbit.tech/session

The API is clearly based on the GAMA NPC "Quantum Engine AI" integration thing, which you can find partial docs for here (paste it into https://studio.asyncapi.com/), but this is more of a curiosity than useful documentation.

You'll need to set a couple of HTTP headers before it'll work, App-Version and OS-Version. Valid values for these fields change in each update, so I won't list them here, but maybe someone will be nice and leave currently-working values in the comments. (it sounds like OS-Version is the more important of the two, App-Version maybe doesn't matter)

Device-Health (UNTESTED)

In newer updates (v0.8.99+) a timestamp string in the format rabbit_OS_v0.8.99_20240606175556,YYYYMMDDHHMMSSmmm,xx (where mmm is milliseconds, xx is a random 2-digit even integer) is encrypted with the following RSA-3072 public key:

-----BEGIN RSA PUBLIC KEY-----
MIIBigKCAYEAqLNRPcujKw1elkNJc+10o37YVbb7OjYa4Cv2pG2BzfSV3Ev7LMva
A2w0PAy25DhQU2NI7RU2a51OvTz0DsXM69oakuN0oSrKa9Eit2GPnX89H702MXGX
iRDZWEufAx67AaxK9d80Bajh2Abn06Bwaz9Z4D8vMxUOGsYkVKMW0LrmnW4984XI
UqT3+lOiEijBamodU/mORTeuxc5cdan00fq8qTOYuGFuKlPJSI3EExFHP3ONHD6z
44+PxXmhw532uAiNnT74yKXBoVYU19b8AAWLiSKyjf1eeus7dTobPKcpMemlJgxH
tVHtaSgnUugQ0a3XvmTVQpSeytPw8bL+/3c5KXfjGxPchoEZi7d71wv/AufDiSXr
gaew1KfJZBsr8Somr03b8xsHRJruPT61iPceh9bTWscwnK3WmDpAxnjdPQiflt/m
KkPEETtKGx0X5kUImHnr1jhUdYKmEOHfwkXBKVc66hpn85WGJ7MPVyixIOpzScAY
nKjVsP4ma6iFAgMBAAE=
-----END RSA PUBLIC KEY-----

(nb: This key changed in v0.8.107)

The encryption uses RSA_PKCS1_OAEP_PADDING mode (MGF1, SHA1). The resulting ciphertext is base64-encoded and sent in the Device-Health header. It's unclear how this measures the health of a device, but it's a feature nonetheless.

Example code

I haven't thoroughly tested this yet. At present, the API doesn't seem to mind whether Device-Health is correct, or specified at all.
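
Separately from the linked example, here is an equally untested Python sketch of the scheme as described above. Several details are guesses: that the timestamp is UTC, that "2-digit even integer" means 10 to 98, and that the build string stays fixed. The encryption half relies on the third-party `cryptography` package, imported lazily so the plaintext half works without it. The function names are illustrative.

```python
import base64
import random
from datetime import datetime, timezone

# Build string copied from the format shown above; newer versions will differ.
OS_BUILD = "rabbit_OS_v0.8.99_20240606175556"


def device_health_plaintext(now=None, os_build=OS_BUILD) -> str:
    """Build '<os_build>,YYYYMMDDHHMMSSmmm,xx' (hypothetical helper)."""
    now = now or datetime.now(timezone.utc)  # assumption: timestamp is UTC
    stamp = now.strftime("%Y%m%d%H%M%S") + f"{now.microsecond // 1000:03d}"
    nonce = random.randrange(10, 100, 2)  # assumption: even integer in 10..98
    return f"{os_build},{stamp},{nonce}"


def device_health_header(plaintext: str, public_key_pem: bytes) -> str:
    """RSA-OAEP (MGF1, SHA-1) encrypt the plaintext, then base64-encode it."""
    # Third-party dependency: pip install cryptography
    from cryptography.hazmat.primitives import hashes, serialization
    from cryptography.hazmat.primitives.asymmetric import padding

    key = serialization.load_pem_public_key(public_key_pem)
    ciphertext = key.encrypt(
        plaintext.encode("ascii"),
        padding.OAEP(
            mgf=padding.MGF1(algorithm=hashes.SHA1()),
            algorithm=hashes.SHA1(),
            label=None,
        ),
    )
    return base64.b64encode(ciphertext).decode("ascii")


print(device_health_plaintext())
```

Pass the PEM block above (as bytes) to `device_health_header`. OAEP is randomized, so the header value differs on every call even for identical plaintext.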

Authentication

To authenticate, send a JSON blob that looks like this:

{
	"global": {
		"initialize": {
			"deviceId": IMEI,
			"evaluate": false,
			"greet": true,
			"language": "en",
			"listening": true,
			"location": {
				"latitude": 0.0,
				"longitude": 0.0
			},
			"mimeType": "wav",
			"timeZone": "GMT",
			"token": "rabbit-account-key+" + ACCOUNT_KEY
		}
	}
}

deviceId is whatever you used during activation, and ACCOUNT_KEY is the value of accountKey from the activation response message. Use your imagination for the other fields (I haven't figured out precisely what "listening", "greet" or "evaluate" do yet).

This should be the first thing you send after initiating the websocket connection.
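
Since the blob above is a template rather than literal JSON, here is a small Python sketch that fills it in and serializes it. The function name is invented, the default field values are simply copied from the example, and deviceId is passed as a string, which is an assumption (the template doesn't say whether a string or a number is expected).

```python
import json


def build_initialize_message(device_id: str, account_key: str,
                             language: str = "en", time_zone: str = "GMT",
                             latitude: float = 0.0, longitude: float = 0.0) -> str:
    """Serialize the first message to send on the websocket (hypothetical helper)."""
    payload = {
        "global": {
            "initialize": {
                "deviceId": device_id,  # assumption: sent as a string
                "evaluate": False,
                "greet": True,
                "language": language,
                "listening": True,
                "location": {"latitude": latitude, "longitude": longitude},
                "mimeType": "wav",
                "timeZone": time_zone,
                # Literal prefix followed by accountKey from the activation response.
                "token": "rabbit-account-key+" + account_key,
            }
        }
    }
    return json.dumps(payload)


print(build_initialize_message("490154203237518", "randomhex"))
```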

Send "Terminal" Text Input

Send a JSON message that looks like this:

{
  "kernel": {
    "userText": {
      "text": INPUT
    }
  }
}

Receive "Terminal" Text Output

Text-based responses look like this:

{"kernel": {"assistantResponse": OUTPUT}}
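
A minimal Python sketch of both directions of the "terminal" exchange. The helper names are illustrative, and the receive side assumes that any message without a kernel.assistantResponse key is some other message type to be handled elsewhere.

```python
import json


def build_user_text(text: str) -> str:
    """Serialize a terminal text input message."""
    return json.dumps({"kernel": {"userText": {"text": text}}})


def extract_assistant_response(raw: str):
    """Return the assistantResponse value, or None for any other message."""
    try:
        message = json.loads(raw)
    except ValueError:
        return None  # not JSON at all
    kernel = message.get("kernel") if isinstance(message, dict) else None
    if not isinstance(kernel, dict):
        return None
    return kernel.get("assistantResponse")


print(build_user_text("What's the weather like?"))
print(extract_assistant_response('{"kernel": {"assistantResponse": "Hello!"}}'))
```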

Receive Voice Output

Example output:

{
  "kernel": {
    "assistantResponseDevice": {
      "text": {
        "language":"en",
        "chars":[" ","H","e","l","l","o",","," ","h","o","w"," ","c","a","n"," ","I"," ","a","s","s","i","s","t"," ","y","o","u"," ","t","o","d","a","y","?"],
        "char_start_times_ms":[0,...],
        "char_durations_ms":[0,...]
      },
      "audio": BASE64_WAV,
      "canned": false
    }
  }
}

NOTE: The text field is actually a stringified JSON object; I'm showing it as plain JSON above for clarity.

I wonder what the canned field indicates?
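
To make the double encoding concrete, here is a hedged Python sketch that unpacks one of these messages: it parses the outer JSON, parses the stringified text field a second time, joins chars into a transcript, and base64-decodes the audio. The function name and return shape are made up for illustration.

```python
import base64
import json


def decode_voice_output(raw: str):
    """Return (transcript, wav_bytes, timing_info) from a voice output message."""
    device = json.loads(raw)["kernel"]["assistantResponseDevice"]
    text = device["text"]
    if isinstance(text, str):
        # On the wire the text field is itself a JSON document in a string.
        text = json.loads(text)
    transcript = "".join(text["chars"])
    wav_bytes = base64.b64decode(device["audio"])
    return transcript, wav_bytes, text


# Synthetic message for demonstration only; the audio here is not a real WAV.
sample = json.dumps({
    "kernel": {
        "assistantResponseDevice": {
            "text": json.dumps({
                "language": "en",
                "chars": ["H", "i", "!"],
                "char_start_times_ms": [0, 80, 160],
                "char_durations_ms": [80, 80, 80],
            }),
            "audio": base64.b64encode(b"RIFF").decode("ascii"),
            "canned": False,
        }
    }
})
transcript, wav_bytes, timing = decode_voice_output(sample)
print(transcript, len(wav_bytes))
```

The per-character start times and durations are returned untouched in the third element, in case you want to animate text in sync with the audio.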

Set Push-to-Talk State

Send a JSON message like this:

{
  "kernel": {
    "voiceActivity": {
      "imageBase64": "",
      "state": STATE
    }
  }
}

Where STATE is one of: inactive, pttButtonPressed, pttButtonReleased.
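
A small Python sketch of a builder for this message that rejects anything other than the three states listed above. The names are illustrative.

```python
import json

PTT_STATES = ("inactive", "pttButtonPressed", "pttButtonReleased")


def build_ptt_message(state: str, image_base64: str = "") -> str:
    """Serialize a voiceActivity message for the given push-to-talk state."""
    if state not in PTT_STATES:
        raise ValueError(f"unknown PTT state: {state!r}")
    return json.dumps({
        "kernel": {
            "voiceActivity": {
                "imageBase64": image_base64,
                "state": state,
            }
        }
    })


print(build_ptt_message("pttButtonPressed"))
```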

Streaming Voice Input

Set PTT state to pressed, then send 0.1 second chunks of uncompressed WAV as bytes directly down the websocket, then set PTT state to released. It looks like it uses 16kHz stereo, 16-bit samples.
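
For the arithmetic: at 16 kHz, 2 channels, and 2 bytes per sample, 0.1 seconds of audio is 1600 frames, or 6400 bytes. Here is a Python sketch that slices raw PCM data into chunks of that size. Whether each chunk needs its own WAV header is not something I can confirm from the description above; this sketch just splits the sample data and leaves that question open.

```python
SAMPLE_RATE_HZ = 16_000
CHANNELS = 2            # stereo, per the observation above
BYTES_PER_SAMPLE = 2    # 16-bit samples
CHUNKS_PER_SECOND = 10  # 0.1 second per chunk

FRAMES_PER_CHUNK = SAMPLE_RATE_HZ // CHUNKS_PER_SECOND        # 1600 frames
CHUNK_BYTES = FRAMES_PER_CHUNK * CHANNELS * BYTES_PER_SAMPLE  # 6400 bytes


def iter_audio_chunks(pcm: bytes, chunk_bytes: int = CHUNK_BYTES):
    """Yield successive 0.1 s slices of PCM data; the last one may be shorter."""
    for offset in range(0, len(pcm), chunk_bytes):
        yield pcm[offset:offset + chunk_bytes]


# One second of silence, just to show the chunk count.
silence = bytes(SAMPLE_RATE_HZ * CHANNELS * BYTES_PER_SAMPLE)
chunks = list(iter_audio_chunks(silence))
print(CHUNK_BYTES, len(chunks))
```

Each chunk would go down the websocket as a binary frame, bracketed by the pressed and released PTT messages.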

Image Input (UNTESTED)

Send a JPEG file encoded as a base64 data URI, nominally 1080x720px at 100% quality (although other resolutions, qualities, and formats presumably work too), in the imageBase64 field of a pttButtonReleased PTT message. The image is sent along with a voice input, as described above.

{
  "kernel": {
    "voiceActivity": {
      "imageBase64": "",
      "state": "pttButtonReleased"
    }
  }
}
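
An equally untested Python sketch of filling in that field. The exact data URI prefix (`data:image/jpeg;base64,`) is an assumption based on the usual data URI form, and the function name is illustrative.

```python
import base64
import json


def build_image_ptt_message(jpeg_bytes: bytes) -> str:
    """Serialize a pttButtonReleased message carrying a JPEG as a data URI."""
    # Assumption: a standard data URI with the image/jpeg media type.
    data_uri = "data:image/jpeg;base64," + base64.b64encode(jpeg_bytes).decode("ascii")
    return json.dumps({
        "kernel": {
            "voiceActivity": {
                "imageBase64": data_uri,
                "state": "pttButtonReleased",
            }
        }
    })


# Placeholder bytes (the JPEG start-of-image marker), not a full image.
print(build_image_ptt_message(b"\xff\xd8\xff"))
```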
@Pinball3D

Holy f*** this is amazing

@Pinball3D

Pinball3D commented May 28, 2024

Hey, if you see this, how do I set up the proxy? I'm on macOS. I replaced the proxy.go file with the one you provided and ran make, then ./ja3proxy, and it says this: "2024/05/27 19:55:15 copy client to dest error: read tcp 192.168.1.52:52301->18.160.41.19:443: use of closed network connection"

EDIT: Got it working, authenticated my "r1" now just need to get the wss working

@Proton0

Proton0 commented Jun 1, 2024

@Pinball3D how did you manage to fix it? I am on MacOS and same issue

@Pinball3D

@Pinball3D how did you manage to fix it? I am on MacOS and same issue

I’m not entirely sure how I fixed it to be honest. I think I might have rebuilt the executable or something. I will look when I am on my computer later.

@DarthChief394

Has anyone managed to authenticate? I can't figure out how to send the JSON blob.

@CodeMusic

Wait, so you can make a Python script on your computer and, at different times throughout the day, prompt the rabbit to say something even if the rabbit is in a different area, assuming both devices are connected to the Internet?

@CodeMusic

Is there a way to unlink, factory reset, or delete the account, so that you can proceed with the activation flow that you provided? Alternatively, is there a way to get that payload data post-activation, e.g. via carroot Terminal access?
