Why I hit 300 graphemes limit? #434
-
Software
Issue

When attempting to send 2 (different) posts of mine (containing hashtags, links incl. external embed) using the

Response(success=False, status_code=400, content=XrpcError(error='InvalidRequest', message='Invalid app.bsky.feed.post record: Record/text must not be longer than 300 graphemes'), headers={'x-powered-by': 'Express', 'access-control-allow-origin': '*', 'cache-control': 'private', 'vary': 'Authorization, Accept-Encoding', 'ratelimit-limit': '5000', 'ratelimit-remaining': '4997', 'ratelimit-reset': '1731571232', 'ratelimit-policy': '5000;w=3600', 'content-type': 'application/json; charset=utf-8', 'content-length': '123', 'etag': 'W/"7b-5OYjOucrkx456vG6nIBiH/VZWIA"', 'date': 'Thu, 14 Nov 2024 07:00:32 GMT', 'keep-alive': 'timeout=90', 'strict-transport-security': 'max-age=63072000'})

Response(success=False, status_code=400, content=XrpcError(error='InvalidRequest', message='Invalid app.bsky.feed.post record: Record/text must not be longer than 300 graphemes'), headers={'x-powered-by': 'Express', 'access-control-allow-origin': '*', 'cache-control': 'private', 'vary': 'Authorization, Accept-Encoding', 'ratelimit-limit': '5000', 'ratelimit-remaining': '4991', 'ratelimit-reset': '1731571232', 'ratelimit-policy': '5000;w=3600', 'content-type': 'application/json; charset=utf-8', 'content-length': '123', 'etag': 'W/"7b-5OYjOucrkx456vG6nIBiH/VZWIA"', 'date': 'Thu, 14 Nov 2024 07:00:44 GMT', 'keep-alive': 'timeout=90', 'strict-transport-security': 'max-age=63072000'})

My code, however, which calculates the length of the post to be sent to Bluesky by mimicking the way the web interface would have counted it (based on my own tests composing similar types of posts), failed to determine that it is exceeding the

As confirmation, I took the exact contents of the 2 posts and put them right into the Bluesky web interface's post "composer" and both of them returned the lengths,

Is this a known issue, and is there anything we could do to try and fix this?

Please let me know if you require any further information. Thank you.

I'll add here possibly related issues from this/other related projects I've found:
Replies: 8 comments 2 replies
-
If I'm not mistaken, this is a server-side error, meaning the Python library tried to send your post but the Bluesky server rejected it. It would be helpful to also see the content of the post you were trying to send, and the code you were trying to send it with. Did the post include any other facets like embeds or links? How many of them? It might be that the way you constructed the post in Python was different from what you then tried in the app. I haven't personally run into issues with post length when sending posts with the Python SDK, so I would assume it's an issue with how you were writing the post up in Python.
-
Bsky uses two limits: length and graphemes. The Python SDK only validates length locally for now, so it will not prevent the request from being sent. As @emilyhunt said, it is a server-side error where the server's grapheme validator rejects the request. Let's take a look at the post's text limits (according to the lexicon): max length is 3000 and max graphemes is 300. Finally, to understand these limits, we need to understand what graphemes are. A grapheme is like a symbol, but a visual one for the user. Take emoji, for example: an emoji can take several bytes in Unicode, so its length is >1, but it is only 1 grapheme. So, please check what you are trying to post. Maybe a lot of emojis, different Unicode symbols, etc.
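To make the difference between the two limits concrete, here is a minimal sketch using only the Python standard library. `approx_grapheme_count` is a hypothetical helper, not part of the SDK, and it only approximates Unicode grapheme segmentation: it handles combining marks, variation selectors, skin-tone modifiers and zero-width joiners, but not flag pairs or Hangul jamo, for example. Exact results would need a full implementation of the Unicode text segmentation rules.

```python
import unicodedata

ZWJ = "\u200d"  # zero-width joiner, glues several emoji into one visible symbol


def approx_grapheme_count(text: str) -> int:
    """Rough count of user-perceived characters (an approximation only)."""
    count = 0
    joined = False  # True right after a ZWJ: the next symbol extends the cluster
    for ch in text:
        if ch == ZWJ:
            joined = True
            continue
        extends_previous = (
            # combining marks; variation selectors fall in category Mn
            unicodedata.category(ch) in ("Mn", "Me", "Mc")
            # emoji skin-tone modifiers
            or 0x1F3FB <= ord(ch) <= 0x1F3FF
        )
        if extends_previous:
            continue
        if joined:
            joined = False
            continue
        count += 1
    return count


# man + ZWJ + woman + ZWJ + girl: rendered as a single "family" emoji
family = "\U0001F468\u200d\U0001F469\u200d\U0001F467"
print(len(family))                    # 5 code points
print(len(family.encode("utf-8")))    # 18 UTF-8 bytes
print(approx_grapheme_count(family))  # 1 visible symbol
```

For plain ASCII text all three numbers are the same, so a post without emoji or accented characters has as many graphemes as it has characters.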
-
Thank you for the feedback! For better context, here's the general flow of my code before it gets sent:

Here's a sample flow of the code:

if __name__ == "__main__":
    # test variables
    handle = "handle.example.com"  # example only
    access_token = "OIZJ3aCRTbPeIhcHwn4"  # example only
    # instantiate bluesky
    client = Client()
    login = client.login(
        handle,
        access_token,
    )
    # count/limit post length (grapheme)
    post_title = "Fahmi: Govt not blocking social media platforms that fail to apply for licensing after 1st Jan 2025"
    post_tags = ['apps', 'digitallife', 'mcmc', 'news', 'socialmedia', 'socialmedialicence', 'socialmedialicensing', 'socialmediaregulation']
    post_link = "https://soyacincau.com/2024/11/11/govt-not-blocking-social-media-platforms-fail-to-apply-for-licensing/"
    bluesky_post = validate_post_length(post_title, post_tags, post_link)
    ## bluesky_post = "Fahmi: Govt not blocking social media platforms that fail to apply for licensing after 1st Jan 2025 #apps #digitallife #mcmc #news #socialmedia #socialmedialicence #socialmedialicensing #socialmediaregulation\n\nhttps://soyacincau.com/2024/11/11/govt-not-blocking-social-media-platforms-fail-to-apply-for-licensing"
    # make post rich
    bluesky_post, link_embed = build_rich_post(client, bluesky_post)
## vars(bluesky_post) = {'_buffer': <_io.BytesIO object at 0x7ab718cfbf60>, '_facets': [Main(features=[Tag(tag='apps', py_type='app.bsky.richtext.facet#tag')], index=ByteSlice(byte_end=105, byte_start=100, py_type='app.bsky.richtext.facet#byteSlice'), py_type='app.bsky.richtext.facet'), Main(features=[Tag(tag='digitallife', py_type='app.bsky.richtext.facet#tag')], index=ByteSlice(byte_end=118, byte_start=106, py_type='app.bsky.richtext.facet#byteSlice'), py_type='app.bsky.richtext.facet'), Main(features=[Tag(tag='mcmc', py_type='app.bsky.richtext.facet#tag')], index=ByteSlice(byte_end=124, byte_start=119, py_type='app.bsky.richtext.facet#byteSlice'), py_type='app.bsky.richtext.facet'), Main(features=[Tag(tag='news', py_type='app.bsky.richtext.facet#tag')], index=ByteSlice(byte_end=130, byte_start=125, py_type='app.bsky.richtext.facet#byteSlice'), py_type='app.bsky.richtext.facet'), Main(features=[Tag(tag='socialmedia', py_type='app.bsky.richtext.facet#tag')], index=ByteSlice(byte_end=143, byte_start=131, py_type='app.bsky.richtext.facet#byteSlice'), py_type='app.bsky.richtext.facet'), Main(features=[Tag(tag='socialmedialicence', py_type='app.bsky.richtext.facet#tag')], index=ByteSlice(byte_end=163, byte_start=144, py_type='app.bsky.richtext.facet#byteSlice'), py_type='app.bsky.richtext.facet'), Main(features=[Tag(tag='socialmedialicensing', py_type='app.bsky.richtext.facet#tag')], index=ByteSlice(byte_end=185, byte_start=164, py_type='app.bsky.richtext.facet#byteSlice'), py_type='app.bsky.richtext.facet'), Main(features=[Tag(tag='socialmediaregulation', py_type='app.bsky.richtext.facet#tag')], index=ByteSlice(byte_end=208, byte_start=186, py_type='app.bsky.richtext.facet#byteSlice'), py_type='app.bsky.richtext.facet'), Main(features=[Link(uri='https://soyacincau.com/2024/11/11/govt-not-blocking-social-media-platforms-fail-to-apply-for-licensing', py_type='app.bsky.richtext.facet#link')], index=ByteSlice(byte_end=312, byte_start=210, 
py_type='app.bsky.richtext.facet#byteSlice'), py_type='app.bsky.richtext.facet')]}
## vars(link_embed) = {'external': External(description='The government has no intention to block internet messagin service and social media platforms that fail to apply for licensing after the 1st of January 2025.', title="Govt won't block platforms without licence after 1st Jan 2025", uri='https://soyacincau.com/2024/11/11/govt-not-blocking-social-media-platforms-fail-to-apply-for-licensing', thumb=BlobRef(mime_type='image/jpeg', size=236560, ref=IpldLink(link='bafkreiggqa543b2at3rd5wsfjmy7ytbilxkfzbzzpzdfrxh2gv6yoakika'), py_type='blob'), py_type='app.bsky.embed.external#external'), 'py_type': 'app.bsky.embed.external'}
    # send post call
    try:
        post_id = send_bluesky_post(
            client,
            bluesky_post,
            link_embed=link_embed
        )
    except Exception as e:
        print(e)
        ## e = Response(success=False, status_code=400, content=XrpcError(error='InvalidRequest', message='Invalid app.bsky.feed.post record: Record/text must not be longer than 300 graphemes'), headers={'x-powered-by': 'Express', 'access-control-allow-origin': '*', 'cache-control': 'private', 'vary': 'Authorization, Accept-Encoding', 'ratelimit-limit': '5000', 'ratelimit-remaining': '4985', 'ratelimit-reset': '1731634236', 'ratelimit-policy': '5000;w=3600', 'content-type': 'application/json; charset=utf-8', 'content-length': '123', 'etag': 'W/"7b-5OYjOucrkx456vG6nIBiH/VZWIA"', 'date': 'Fri, 15 Nov 2024 01:20:02 GMT', 'keep-alive': 'timeout=90', 'strict-transport-security': 'max-age=63072000'})

and here's the simplified method that sends the post:

def send_bluesky_post(client, content, **kwargs):
    link_embed = kwargs.get("link_embed", None)
    params = kwargs.get("params", {})
    post_id = kwargs.get("post_id")
    # include link embed object if applicable
    if link_embed:
        params.update(embed=link_embed)
    # send bluesky post
    # NOTE: post update not currently supported on bluesky - alternative implementation would be to quote instead
    if post_id:
        quote_embed = atproto_models.app.bsky.embed.record.Main(
            record=atproto_models.ComAtprotoRepoStrongRef.Main(
                uri=post_id.split(",")[0],
                cid=post_id.split(",")[1]
            )
        )
        if link_embed:
            # quote plus external link: wrap both in a record-with-media embed
            params.update(embed=atproto_models.AppBskyEmbedRecordWithMedia.Main(record=quote_embed, media=link_embed))
        else:
            params.update(embed=quote_embed)
    post = client.send_post(text=content, **params)
    # return post id
    return "%s,%s" % (getattr(post, "uri"), getattr(post, "cid"))

For this sample post, composing it on the Bluesky web interface, which gives the exact same resulting post (superficially at least) with the built tags, link, and external embed - the perceived grapheme count by the interface is

I would appreciate any advice on what I could do better here to handle or work around this.
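One way to check the sample above without guessing: the facet dump already gives byte offsets into the text, and the link facet ends at `byte_end=312`. Since the text is plain ASCII, bytes, code points and graphemes coincide, so the text that was sent is 312 graphemes long. A minimal sketch that rebuilds the string from the values shown in the dump (nothing here calls the SDK; the link is taken as it appears in the final text, without the trailing slash):

```python
post_title = "Fahmi: Govt not blocking social media platforms that fail to apply for licensing after 1st Jan 2025"
post_tags = ["apps", "digitallife", "mcmc", "news", "socialmedia",
             "socialmedialicence", "socialmedialicensing", "socialmediaregulation"]
post_link = "https://soyacincau.com/2024/11/11/govt-not-blocking-social-media-platforms-fail-to-apply-for-licensing"

# Same layout as the commented bluesky_post value: title, tags, blank line, link
text = post_title + " " + " ".join("#" + tag for tag in post_tags) + "\n\n" + post_link

print(text.isascii())             # True: one byte per character
print(len(post_title))            # 99
print(len(post_link))             # 102
print(len(text))                  # 312
print(len(text.encode("utf-8")))  # 312, matches byte_end of the link facet
```

That is 12 over the limit of 300, while a counter that caps the link at 23 characters would report a value well under 300 for the same text.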
-
Thanks! But it would be helpful to see the source code in |
-
The final post (text) variables were included in my previous sample, specifically the lines with double leading comment markers (##):

bluesky_post = validate_post_length(post_title, post_tags, post_link)
## bluesky_post = "Fahmi: Govt not blocking social media platforms that fail to apply for licensing after 1st Jan 2025 #apps #digitallife #mcmc #news #socialmedia #socialmedialicence #socialmedialicensing #socialmediaregulation\n\nhttps://soyacincau.com/2024/11/11/govt-not-blocking-social-media-platforms-fail-to-apply-for-licensing"
# make post rich
bluesky_post, link_embed = build_rich_post(client, bluesky_post)
## vars(bluesky_post) = {'_buffer': <_io.BytesIO object at 0x7ab718cfbf60>, '_facets': [Main(features=[Tag(tag='apps', py_type='app.bsky.richtext.facet#tag')], index=ByteSlice(byte_end=105, byte_start=100, py_type='app.bsky.richtext.facet#byteSlice'), py_type='app.bsky.richtext.facet'), Main(features=[Tag(tag='digitallife', py_type='app.bsky.richtext.facet#tag')], index=ByteSlice(byte_end=118, byte_start=106, py_type='app.bsky.richtext.facet#byteSlice'), py_type='app.bsky.richtext.facet'), Main(features=[Tag(tag='mcmc', py_type='app.bsky.richtext.facet#tag')], index=ByteSlice(byte_end=124, byte_start=119, py_type='app.bsky.richtext.facet#byteSlice'), py_type='app.bsky.richtext.facet'), Main(features=[Tag(tag='news', py_type='app.bsky.richtext.facet#tag')], index=ByteSlice(byte_end=130, byte_start=125, py_type='app.bsky.richtext.facet#byteSlice'), py_type='app.bsky.richtext.facet'), Main(features=[Tag(tag='socialmedia', py_type='app.bsky.richtext.facet#tag')], index=ByteSlice(byte_end=143, byte_start=131, py_type='app.bsky.richtext.facet#byteSlice'), py_type='app.bsky.richtext.facet'), Main(features=[Tag(tag='socialmedialicence', py_type='app.bsky.richtext.facet#tag')], index=ByteSlice(byte_end=163, byte_start=144, py_type='app.bsky.richtext.facet#byteSlice'), py_type='app.bsky.richtext.facet'), Main(features=[Tag(tag='socialmedialicensing', py_type='app.bsky.richtext.facet#tag')], index=ByteSlice(byte_end=185, byte_start=164, py_type='app.bsky.richtext.facet#byteSlice'), py_type='app.bsky.richtext.facet'), Main(features=[Tag(tag='socialmediaregulation', py_type='app.bsky.richtext.facet#tag')], index=ByteSlice(byte_end=208, byte_start=186, py_type='app.bsky.richtext.facet#byteSlice'), py_type='app.bsky.richtext.facet'), Main(features=[Link(uri='https://soyacincau.com/2024/11/11/govt-not-blocking-social-media-platforms-fail-to-apply-for-licensing', py_type='app.bsky.richtext.facet#link')], index=ByteSlice(byte_end=312, byte_start=210, 
py_type='app.bsky.richtext.facet#byteSlice'), py_type='app.bsky.richtext.facet')]}
## vars(link_embed) = {'external': External(description='The government has no intention to block internet messagin service and social media platforms that fail to apply for licensing after the 1st of January 2025.', title="Govt won't block platforms without licence after 1st Jan 2025", uri='https://soyacincau.com/2024/11/11/govt-not-blocking-social-media-platforms-fail-to-apply-for-licensing', thumb=BlobRef(mime_type='image/jpeg', size=236560, ref=IpldLink(link='bafkreiggqa543b2at3rd5wsfjmy7ytbilxkfzbzzpzdfrxh2gv6yoakika'), py_type='blob'), py_type='app.bsky.embed.external#external'), 'py_type': 'app.bsky.embed.external'}

but yes, for clarity, here are the simplified versions of those two methods:

build_rich_post

def build_rich_post(client, text):
    link_embed = None
    rich_post = client_utils.TextBuilder()
    # define patterns
    hashtag_pattern = r"#\w+"
    mention_pattern = r"@\w+"
    url_pattern = r"http[s]?://\S+"
    # split the text using urls and keep the delimiters
    parts = re.split("(%s)" % url_pattern, text)
    # build rich post
    for part in parts:
        # build link
        if re.match(url_pattern, part):
            rich_post.link(part, part)
            # skip creating link embed object if one exists
            if link_embed:
                continue
            # get required metadata
            link_metadata = get_content_md(part)
            # create link embed object if sufficient metadata
            if link_metadata and ((description := link_metadata.get("description")) and (title := link_metadata.get("title"))):
                thumbnail_bin = getattr(requests.get(thumbnail), "content", None) if (thumbnail := link_metadata.get("thumbnail")) else None
                params = dict(
                    description=description,
                    thumb=client.upload_blob(data=thumbnail_bin).blob if thumbnail_bin else None,
                    title=title,
                    uri=part,
                )
                link_embed = atproto_models.AppBskyEmbedExternal.Main(
                    external=atproto_models.AppBskyEmbedExternal.External(**params)
                )
        else:
            # split by spaces to handle individual words and hashtags - keep the spaces as separate elements
            words = re.split(r'(\s+)', part)
            for word in words:
                # build hashtag
                if re.match(hashtag_pattern, word):
                    rich_post.tag(word, word[1:])
                # build mention if valid
                elif re.match(mention_pattern, word):
                    user_did = getattr(get_user(client, word[1:]), "did", None)
                    rich_post.mention(word, user_did) if user_did else rich_post.text(word)
                # add regular text
                else:
                    rich_post.text(word)
    return rich_post, link_embed

validate_post_length

def validate_post_length(title, tags, link):
    # set character limits
    char_limit = 300
    link_limit = sum((23, 2))  # additional 2 for 2 newlines
    # count characters
    title_count = len(title)
    tags_count = len(tags)
    link_count = 0 if not link else (link_limit if sum((len(link), 2)) > link_limit else sum((len(link), 2)))
    # emoji_count is how many emoji characters are in the post
    # emoji_length is the perceived length of all the emojis by python (which is inaccurate when counting graphemes)
    emoji_count, emoji_length = count_emoji(title + tags + link)
    # prioritise removing tags, then limiting title to accommodate link
    if sum((title_count, tags_count, link_count, emoji_count - emoji_length)) > char_limit:
        tags = ""
        emoji_count = count_emoji(title + tags + link)[0]
        title = title[:char_limit - (link_count + emoji_count)]
    # return post content
    return "{title}{tags}\n\n{link}".format(title=title, tags=tags, link=link)
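Given that the server rejected the text above, a safer assumption is that every character of the link text counts toward the limit, rather than a fixed 23. A minimal sketch of a validator built on that assumption follows; `fit_post` and its `count` parameter are hypothetical names, and the default `len` is only exact for text without emoji or combining characters (a grapheme counter can be passed in instead). Unlike the method above, it drops tags one at a time from the end instead of all at once.

```python
from typing import Callable, Sequence


def fit_post(
    title: str,
    tags: Sequence[str],
    link: str,
    limit: int = 300,
    count: Callable[[str], int] = len,
) -> str:
    """Assemble 'title #tags', a blank line and the link, trimmed to fit the limit."""

    def assemble(current_title: str, current_tags: Sequence[str]) -> str:
        body = " ".join([current_title] + ["#" + tag for tag in current_tags])
        return body + "\n\n" + link if link else body

    kept = list(tags)
    # Drop tags from the end until the whole text (full link included) fits.
    while kept and count(assemble(title, kept)) > limit:
        kept.pop()
    text = assemble(title, kept)
    # No tags left and still too long: shorten the title one character at a time.
    while title and count(text) > limit:
        title = title[:-1].rstrip()
        text = assemble(title, kept)
    return text


title = "Fahmi: Govt not blocking social media platforms that fail to apply for licensing after 1st Jan 2025"
tags = ["apps", "digitallife", "mcmc", "news", "socialmedia",
        "socialmedialicence", "socialmedialicensing", "socialmediaregulation"]
link = "https://soyacincau.com/2024/11/11/govt-not-blocking-social-media-platforms-fail-to-apply-for-licensing"

result = fit_post(title, tags, link)
print(len(result))  # 289: only the last tag had to be dropped
```

If the link alone is longer than the limit, this sketch still returns an over-long text, so the caller would need to handle that case separately.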
-
I think the link text counts toward the general grapheme limit. Try a shorter link, or add links as embeds instead of facets.
upd. Oh, you already attach the link as an embed.
-
Yes, but the link text itself is also included so the post has both the

Yeah, I guess that's an option that might help. I just wish there were more clarity on what gets counted, because right now I can only guess/assume. I was hoping that testing with the web interface would return similar results, but apparently not, since the exact same post would've been sufficiently under

Also, speaking of links, I'm kind of confused by how they're counted. This Buffer doc is the only place where I could find information on what Bluesky counts as the maximum length of a link,

For example, I had 2 links (different domains) that had the exact same number of alphanumeric characters and yet both counted as more than

Update: Excluding the (link) facet and only including the
-
I'm running into something similar; it sure would be great to have a way to evaluate a construct for grapheme count / length that's better than "smacking it into the BlueSky server and seeing if it accepts it or not, then backing off via this terrible hack." The link, title, and rest of the text are non-deterministic, so being able to adjust on the fly would be super.
I think the link text counts toward the general grapheme limit. Try a shorter link, or add links as embeds instead of facets.
upd. Oh, you already attach the link as an embed.
upd2. It looks like you are placing links in facets as well. Try only as an embed.
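Besides dropping the link from the text and keeping only the embed, another option is to show a shortened label in the text and let the link facet carry the full URL, since the error message refers to the record text and not to the facet's `uri`. The sketch below builds such a facet as a plain dict; `shorten_label`, `link_facet` and the 30-character cutoff are hypothetical, and the layout is assumed to follow the `app.bsky.richtext.facet` lexicon (byte offsets into the UTF-8 text).

```python
from urllib.parse import urlsplit


def shorten_label(url: str, max_len: int = 30) -> str:
    """Hypothetical display label: host + path, cut to max_len with an ellipsis."""
    parts = urlsplit(url)
    label = parts.netloc + parts.path
    return label if len(label) <= max_len else label[: max_len - 3] + "..."


def link_facet(text: str, label: str, url: str) -> dict:
    """Build a link facet covering the first occurrence of label in text."""
    # Facet indices are byte offsets, so measure the UTF-8 encoded prefix.
    start = len(text[: text.index(label)].encode("utf-8"))
    end = start + len(label.encode("utf-8"))
    return {
        "$type": "app.bsky.richtext.facet",
        "index": {"byteStart": start, "byteEnd": end},
        "features": [{"$type": "app.bsky.richtext.facet#link", "uri": url}],
    }


url = "https://soyacincau.com/2024/11/11/govt-not-blocking-social-media-platforms-fail-to-apply-for-licensing"
label = shorten_label(url)
text = "Govt not blocking platforms\n\n" + label
facet = link_facet(text, label, url)

print(label)           # soyacincau.com/2024/11/11/g...
print(facet["index"])  # {'byteStart': 29, 'byteEnd': 59}
```

With the SDK's `TextBuilder`, the same idea would be `rich_post.link(label, url)` in place of `rich_post.link(part, part)`, so that only the 30-character label counts toward the text length.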