Unicode chat

Let's get comfy and talk about Unicode.
☃.net


According to
cpansearch.perl.org/src/SBURKE/Text-Unidecode-0.04/lib/Text/Unidecode.pm
the early CDs from the Unicode Consortium came with this song.
youtube.com/watch?v=hH2oAeAwigo


Buzzfeed broke the story a while back that, thanks to Apple, you aren't getting a rifle emoji.
buzzfeed.com/charliewarzel/thanks-to-apples-influence-youre-not-getting-a-rifle-emoji
While technically true, that doesn't tell you everything. The rifle was originally part of a Unicode 9.0 set of icons for the Olympic Games. As Buzzfeed says, the decision to remove it as an emoji was unanimous.

Rifle actually made it into Unicode 9.0, just not as an emoji. It is in
Supplemental Symbols and Pictographs
U+1F946 RIFLE 🥆
There is nothing stopping anyone from using it as an emoticon/emoji except the lack of font support for that block. There was also a minor debate about which block the symbol belongs in.


If you didn't already know, Unicode 9.0 was released just a few days ago. BabelStone, which has a Unicode blog and several fonts for ancient scripts, wrote about what's new.
babelstone.co.uk/Blog/2016/01/whats-new-in-unicode-90.html


Still have limited Unicode font support? Google is attempting to create a set of fonts that will cover all of Unicode. They are called Noto, and they replace the Droid font set.
google.com/get/noto/
Not enough? Try Wikipedia's Unicode font list for large mega fonts.
en.wikipedia.org/wiki/Unicode_font

symbola.ttf master race

I, for one, am still livid about the Unicode Consortium's negligent and, frankly, amateurish approach to dinosaur emojis.

unicode.org/L2/L2016/16103-jurassic-fdbk.pdf

I don't know much about dinosaurs, so I don't know if you are joking, but I have seen proposals by Everson where he isn't sure and it sounds like he is guessing. It wouldn't come as a shock to me to hear that most of them have little understanding of what they are asked to encode.


What about Quivira?

The Story of Unicode

A: we need more than 16 bits so that we can truly be universal, and include Chinese.

B: uh, but the Chinese already have 16-bit fixed width encodings that work well for them, that are both simpler and more compact than your Chinese proposals.

A: Racist!

B: OK, fine, whatever.

A: adds upside-down text, Jewish plusses, ellipses, smileys, original pictographs of animals, left-wing political iconography, cancer, and more!

B: ... Well at least I can sort of pretend that this swastika is the Nazi one.

While it had good intentions a while back, when ASCII wasn't enough, it grew like cancer. And here we are with Unicode in URLs, Cyrillic letters that look almost (and in some cases exactly) identical to Latin letters, a Greek question mark that looks like the Latin semicolon ";", and much, much more. Did they really have to go overboard?
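
For anyone who wants to poke at this, here is a rough Python sketch using only the stdlib `unicodedata` module. It suggests the two cases are not treated the same: as far as I know, the Greek question mark is canonically equivalent to the semicolon and folds into it under normalization, while a Cyrillic lookalike letter stays distinct. The variable names are just for illustration.

```python
import unicodedata

greek_qm = "\u037e"   # GREEK QUESTION MARK
semicolon = ";"       # U+003B SEMICOLON
cyr_a = "\u0430"      # CYRILLIC SMALL LETTER A
lat_a = "a"           # U+0061 LATIN SMALL LETTER A

# Different code points, so a plain comparison fails.
print(greek_qm == semicolon)

# NFC normalization maps the Greek question mark onto the semicolon.
print(unicodedata.normalize("NFC", greek_qm) == semicolon)

# No normalization form maps Cyrillic "a" onto Latin "a".
print(unicodedata.normalize("NFKC", cyr_a) == lat_a)
```

So normalizing input would catch the question mark trick but does nothing about Cyrillic lookalikes.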

☃

Let's not forget the emoji movie.
Cancer.

let's also not forget the guy who ran this site abandoned it to make an emoji programming language

That is a valid criticism of Unicode. Another point to add is that they are inconsistent. In some cases you get identical characters that differ only in script; the Cyrillic and Latin letters you mention are great examples. In other cases, when people requested special characters, Unicode told them to use a Latin character plus a combining character to get what they want. And in still other cases they actually have added Latin characters that could have been created with a combining character.
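
To make the combining character point concrete, here is a small Python sketch (stdlib `unicodedata` only; the helper name `same_text` is made up for this example). It takes "é", which exists both as a single precomposed code point and as "e" plus a combining accent, and shows that the two only compare equal after normalization.

```python
import unicodedata

precomposed = "\u00e9"   # LATIN SMALL LETTER E WITH ACUTE
combined = "e\u0301"     # "e" followed by COMBINING ACUTE ACCENT

def same_text(a: str, b: str) -> bool:
    """Compare two strings after canonical composition (NFC)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

# The raw code point sequences differ.
print(precomposed == combined)

# After normalization they are the same text.
print(same_text(precomposed, combined))

# Decomposing the precomposed form yields the base letter plus the accent.
print([unicodedata.name(c) for c in unicodedata.normalize("NFD", precomposed)])
```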

Many people agree with your other point that they went overboard. There was severe criticism years ago when ancient scripts and pictures (Wingdings, control boxes and such) were proposed. Some only wanted Unicode to be about current living languages, not an encoding of every language that has ever existed.

Summer
u
m
m
e
r

Also a reminder Commodore did special dingbat characters in their text encoding first

Modern reimplementation of PETSCII when?

Unicode became cancer when they adopted Emoji. Unicode was meant to make it possible to exchange text using all known languages. Fair enough. They also added dead scripts, like runes, but that's still OK because they can be useful for transcribing old documents. Then they added made-up languages like Elvish and Klingon. That's already going too far as far as I'm concerned, but at least those are still glyphs. Then came Emoji, but at least that was just adopting an existing set. But now they keep inventing new icons, and because Emoji are ideographs there will never be an end to this. There are two police car emoji from different perspectives, for fuck's sake. There are three bicycles: one without a person, one with a person, and one with a person in front of a mountain backdrop (not even counting all the skin color variations). Why don't we add one with a forest, or beach, or city, or countryside backdrop as well? There is no unicycle emoji though, so repeat all that for unicycles as well. There is a fucking carousel horse emoji. When on earth am I ever going to need that? They also fuck up the height of a line when used inline with regular text.
🚲 🚴 🚵

The only good use for Emoji I have ever found is as error symbols in the gutter area of Vim for syntax checkers and linters. Vim can't display images, and even if it could it would be of no use when I'm using Neovim from the terminal, but since Emoji are text it can display those.
💡🔵⭕️🚫⚠️❌⁉️💩 (why the fuck does a turd with eyes even exist?)


To be fair, separate characters for Cyrillic make sense because they are different letters. A latin Y and a cyrillic Π£ (a U character) can be rendered differently or the same by a font, the decision should be up to the font, Unicode is just about semantics. But it does create problems in that one could mask a phishing URL by replacing similar looking characters. Does your font show a different between Trust.net and Π’rust.nΠ΅t?
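
If your font doesn't show it, code can. Here is a rough Python sketch of a mixed-script check. Big assumption: the stdlib doesn't expose the Unicode Script property, so this takes the first word of each character's name ("LATIN", "CYRILLIC") as a stand-in. The function names `scripts_used` and `looks_spoofed` are made up for the example, and this is nowhere near a complete confusables check.

```python
import unicodedata

def scripts_used(text: str) -> set:
    """Return a rough set of script names for the letters in text.

    Uses the first word of each character's Unicode name as a proxy
    for its script, which is an approximation, not the real property.
    """
    scripts = set()
    for ch in text:
        if ch.isalpha():
            scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts

def looks_spoofed(hostname: str) -> bool:
    """Flag hostnames whose letters come from more than one script."""
    return len(scripts_used(hostname)) > 1

plain = "Trust.net"
spoofed = "\u0422rust.n\u0435t"  # CYRILLIC CAPITAL LETTER TE, CYRILLIC SMALL LETTER IE

print(plain == spoofed)                               # they only look alike
print(scripts_used(plain), looks_spoofed(plain))
print(scripts_used(spoofed), looks_spoofed(spoofed))
```

A hostname written entirely in Cyrillic would pass this check, so it only catches the mixed case.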

people would try to eat it if it wasn't anthropomorphised

Well it's not going to go away so might as well make use of it and use it for text-based interface icons.

I do not click links and just answer to OP, as I have a toaster.

While I agree with your main point I have to point out this is not true. They did add the Mormon Deseret, but that was actually used in history. Klingon and other fantasy scripts are in a private use constructed language database. They register their own blocks and operate it like the consortium, but it is not official.
en.wikipedia.org/wiki/ConScript_Unicode_Registry

But user, all the usefulness of those symbols could be created with regular text symbols and terminal colors
OØ!X!?

fucking
REEEEEEEEEEEEEEEEEEEEE

I know, but Emoji stick out better, which is a plus in case of error symbols in my opinion. Don't get me wrong, I would prefer we didn't have Emoji in the first place, but since we have them might as well use them. I was using Unicode characters before, but I found the Emoji to do a better job.

Best encoding UTF-32 master race.

UCS-4 is a terrible interchange format for the same reason UCS-2 was before it became UTF-16 (which then had a whole bunch of other reasons added on top): it requires all parties to explicitly support it, regardless of whether they interpret the data or not. It's also not self-synchronizing, and it's endian-dependent.
By all means use it as an internal representation if it makes sense in your program, but anything coming into or out of your program had better be UTF-8.
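
A quick Python sketch of both complaints, for the doubters. The first part shows the same text producing different UTF-32 bytes depending on byte order. The second shows a reader that joins a UTF-8 stream mid-character finding the next boundary just by skipping continuation bytes. The `resync` helper is made up for this example and is not a library function.

```python
text = "a\u2603"  # "a" followed by U+2603 SNOWMAN

# UTF-32: same text, different bytes depending on byte order.
utf32_le = text.encode("utf-32-le")
utf32_be = text.encode("utf-32-be")
print(utf32_le.hex())  # 6100000003260000
print(utf32_be.hex())  # 0000006100002603

def resync(data: bytes, start: int) -> int:
    """Return the index of the next UTF-8 character boundary at or after start.

    Continuation bytes always have the form 0b10xxxxxx, so skipping
    them is enough to land on the start of a character.
    """
    i = start
    while i < len(data) and (data[i] & 0xC0) == 0x80:
        i += 1
    return i

utf8 = "\u2603ok".encode("utf-8")  # e2 98 83 6f 6b
cut = utf8[1:]                     # stream joined in the middle of the snowman
recovered = cut[resync(cut, 0):].decode("utf-8")
print(recovered)  # ok
```

Lose one byte of a UTF-32 stream and every code unit after it is misaligned, with nothing in the bytes themselves to tell you where a character starts.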

chromatic font support in freetype when