File 163412698772.png - (109.77KB , 1158x263 , 2021-10-13.png )
No. 6814
I'm not sure what happened but the Jp text is broken on the AH thread. It was fine for the longest time. Can this be fixed? Please, this is important to me.
Thanks for your help.
>> No. 6815
File 163413245358.png - (2.54MB , 1920x1080 , mei.png )
I'm not sure either how and why that happened. I'm afraid that I'm not able to fix it.

Curiously, in a few threads, Japanese text looks corrupted when seen from the board index but is shown properly in the thread itself.
>> No. 6816
What a pity. I thought this would be a simple fix. Can you edit the text, maybe? I know what's supposed to be written there.
"Uploaded is a page from うつうつひでお日記." and the last sentence is 吾妻ひでお先生、本当にありがとうございました。

This info probably won't help but the Japanese text broke as soon as I posted there yesterday. I copied the last sentence from the OP, uploaded an img and posted. I noticed it had jumbled the Jp text but assumed it was my browser during reload or something like that and went to sleep. Unfortunately it's not the case.

Well, thank you for looking into it anyways.
>> No. 6817
>>6816
>Can you edit the text, maybe?
Unfortunately I cannot with my capabilities. One possibility is editing directly from the database, but it might be too invasive and I don't have access to it anyway.
>> No. 6818
>>6817
It's alright. Thanks for caring.
>> No. 6819
It's fixed! How did you do it?
Thank you!
>> No. 6820
File 163493345956.png - (2.20MB , 1920x1080 , isuzu5.png )
>>6819
Simply by editing the database. I did say that I couldn't access it, but I guess I was wrong. I also had an opportunity to test code someone wrote (>>/navi/2440).

There are plans to fix the other garbled posts, but I'm not yet sure how I'm going to do it without going through every post by hand. Don't expect the fix anytime soon, though.
>> No. 6821
>>6820
>but I'm not yet sure how I'm going to do it without going through every post by hand
You could use the ftfy Python package from the thread you linked to go through and detect corrupted posts: pre-filter on text with non-ASCII characters, and see if the fixed text differs from the original. There might be a few false positives, but you could review those manually.
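
A rough sketch of that detection pass, in case it helps. The `fix` argument is meant to be something like ftfy's `fix_encoding`. The names `find_suspects` and `naive_fix` and the sample posts are made up for illustration, and `naive_fix` is only a strict cp1252 stand-in so the example runs without ftfy installed.

```python
def find_suspects(posts, fix):
    """Return (post_id, original, fixed) for every post the fixer would change.

    posts: iterable of (post_id, text) pairs (hypothetical shape; adapt to the db).
    fix:   any callable str -> str, e.g. ftfy.fix_encoding.
    """
    suspects = []
    for post_id, text in posts:
        if text.isascii():
            continue  # pure-ASCII posts cannot contain this kind of mojibake
        fixed = fix(text)
        if fixed != text:
            suspects.append((post_id, text, fixed))
    return suspects


def naive_fix(text):
    # Stand-in fixer (assumption): undo "UTF-8 bytes read as windows-1252".
    # If the round trip is not possible, the text is returned unchanged.
    try:
        return text.encode("cp1252").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text


# Made-up sample data, not real posts.
posts = [
    (1, "plain ascii post"),
    (2, "caf\u00c3\u00a9"),                  # mojibake for "caf\u00e9"
    (3, "caf\u00e9"),                        # legitimate accented text
    (4, "\u3042\u308a\u304c\u3068\u3046"),   # legitimate Japanese text
]

for post_id, before, after in find_suspects(posts, naive_fix):
    print(post_id, repr(before), "->", repr(after))
```

Only post 2 should be flagged here. Anything that comes back from a real run could be dumped to a file and reviewed by hand before touching the database.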
>> No. 6823
It's not just JP; it seems to be anything non-ASCII. E.g. this post, which used an accented a and an em dash:

http://tohno-chan.com/an/res/34782.html#i34820

I don't know what caused this on the backend, whether it was a db migration or just some serving-side change. Hopefully it's the latter, in which case I think you just need to set the content type to UTF-8 and let the browser take care of it? But then again, new posts seem to work fine, so maybe it's the former. Idk, hopefully one of the admins can chime in and we can come up with a solution.
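
For what it's worth, here is a small sketch of why the charset matters. It assumes (and this is only a guess about the backend) that the text was stored as UTF-8 and read as windows-1252 somewhere along the way. If the garbled characters are already what is stored in the db, a header change alone would not help.

```python
# The same bytes, read two ways.
original = "\u00e1\u2014"           # an accented a followed by an em dash
raw = original.encode("utf-8")      # what a UTF-8 backend would store and send

as_utf8 = raw.decode("utf-8")       # what a reader told "charset=utf-8" sees
as_cp1252 = raw.decode("cp1252")    # what a windows-1252 reader sees

print(raw)
print(as_utf8 == original)          # the UTF-8 reading round-trips
print(as_cp1252)                    # the windows-1252 reading is garbled
```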
>> No. 6829
Found this amazing automated mojibake fixer: http://www.linestarve.com/tools/mojibake/

Which tells us this was the transformation:

>encode (string→bytes) as sloppy-windows-1252
>decode (bytes→string) as utf-8
>apply (string→string) fix_character_width


Looks like Python's ftfy package can do this too, so you could just run it across the entire db.
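
The first two steps of that transformation can be sketched with just the standard library. This is an approximation, not ftfy's actual code. The helper names are made up, and "sloppy" is assumed to mean that the five bytes windows-1252 leaves undefined are passed through as the control characters with the same number.

```python
# Bytes that windows-1252 leaves undefined. Assumption: a "sloppy" codec maps
# them to the control characters U+0081, U+008D, U+008F, U+0090 and U+009D.
UNDEFINED_CP1252 = {0x81, 0x8D, 0x8F, 0x90, 0x9D}


def sloppy_cp1252_encode(text):
    """Encode as windows-1252, passing the undefined bytes straight through."""
    out = bytearray()
    for ch in text:
        if ord(ch) in UNDEFINED_CP1252:
            out.append(ord(ch))
        else:
            out += ch.encode("cp1252")  # raises UnicodeEncodeError if impossible
    return bytes(out)


def undo_mojibake(text):
    """encode as sloppy windows-1252, then decode as UTF-8.

    Text that does not survive the round trip is returned unchanged, so
    healthy posts are left alone.
    """
    try:
        return sloppy_cp1252_encode(text).decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text


garbled = "\u00c3\u00a1\u00e2\u20ac\u201d"   # mojibake of an accented a + em dash
print(undo_mojibake(garbled) == "\u00e1\u2014")
```

The third step (fix_character_width) is deliberately left out of this sketch.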
>> No. 6830
>>6829
>fix_character_width
>Replace fullwidth Latin characters and halfwidth Katakana with their more standard widths.
I'd recommend against doing this. It would butcher any キタ━━━(゚∀゚)━━━!! posts and SJIS art and the like.
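
To illustrate, here is a rough stdlib approximation of what a width fix does to that kind of post. `approx_fix_character_width` is a made-up name, the assumption is that NFKC-normalizing characters in the Halfwidth and Fullwidth Forms block is close to what ftfy does, and the face is assumed to use the halfwidth mark U+FF9F.

```python
import unicodedata


def approx_fix_character_width(text):
    """Approximate a width fix: NFKC-normalize only U+FF01..U+FFEF."""
    out = []
    for ch in text:
        if 0xFF01 <= ord(ch) <= 0xFFEF:
            out.append(unicodedata.normalize("NFKC", ch))
        else:
            out.append(ch)
    return "".join(out)


# Fullwidth Latin letters become plain ASCII, which is usually harmless.
print(approx_fix_character_width("\uff21\uff22\uff23"))

# A shortened kita face: the halfwidth marks (U+FF9F) are replaced with a
# combining mark (U+309A), so the face no longer renders as intended.
face = "\u30ad\u30bf\u2501(\uff9f\u2200\uff9f)\u2501!!"
changed = approx_fix_character_width(face)
print(face)
print(changed)
print(changed == face)
```

As far as I know, ftfy's `fix_encoding` only repairs the mojibake, and `fix_text` has a `fix_character_width` option that can be turned off, so the width step should be avoidable.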
>> No. 6832
>>6815
>Curiously, in a few threads, Japanese text looks corrupted when seen from the board index but is shown properly in the thread itself.
This year's annual /txt/ post >>/txt/276 made a few hours ago appears to have caused this to occur there too. The board index looked fine shortly before that post was made.
>> No. 6833
File 165092241791.png - (2.60MB , 1920x1080 , ren2.png )
>>6829
I almost forgot about this. Thanks for reminding me.

It seemed simple to fix, but the problem might be more complicated than we thought.
Namely, the encoding settings of MySQL may or may not be all fucked up. We don't exactly know what could happen when we import the fixed table back.

We're planning to do some testing in the near future.
>> No. 6834
File 165112410758.png - (1.25MB , 1280x960 , mizuka19.png )
It should be fixed now. Some threads need "refreshing", but I can assure you it is fixed on the backend.
>> No. 6836
>>6834
Yup, it seems to be fine now. Thanks for the fix, admin-sama.
