Questions tagged [unicode]

Unicode is the standard for computer representation of plain text. It encompasses the Universal Character Set, intended to unambiguously represent all characters used in human writing systems in any language, Unicode Transformation Formats (UTFs), defining standardized formats for storing and transmitting Unicode text, and standards for processing and manipulating text.

8 votes
5 answers

What is the name of this character that looks like an upside down arrowhead?

It came to my attention that the term "caret" refers specifically to the inversely-oriented variant of this symbol, i.e., upward pointing. Oddly enough, searching caret didn't readily reveal ...
Arctiic's user avatar
  • 1,183
2 votes
2 answers

Mixed UTF-16BE strings with ANSI in content stream of a PDF page object

I am working on a tool that generates PDFs on the fly with a custom-chosen local font. Embedding the font is not a problem, and it has been proven that the font is found without problem. However, the font is Chinese, ...
Xerix's user avatar
  • 135
0 votes
1 answer

UTF-8 Decoders fail to decode the encoded strings

I have some encoded values which I believe are UTF-8. Now I don't really know if they are UTF-8 or not, because other online tools and steps to decode UTF-8 are not working, but an open source tool ...
Solo's user avatar
  • 3
0 votes
1 answer

How can I type 🧞 on Windows?

I tried: hold down Alt, press +, press the Unicode hex value 1F9DE, release Alt. This does not work because when I press the F key, it triggers the File menu (Alt+F hotkey) of the browser. In notepad ...
William's user avatar
  • 265
0 votes
1 answer

non-unicode fonts displaying as gibberish in Windows 10 guest running in qemu

I have a Hebrew app that uses non-unicode fonts, and when I run it in my Windows 10 guest, which is managed by virt-manager, it started displaying gibberish, although it used to work right. I tried ...
shmu's user avatar
  • 246
0 votes
2 answers

What is the official name for each of the unicode locale identifiers formats?

Unicode locale identifiers can be written in different formats: en-us, en_US, and enUS. What is the official name for each of these formats? After searching, I found that the format en-us is called ...
Oussama's user avatar
5 votes
2 answers

How do I exhibit the character Ň (U+0147) in my Word 2010? Why doesn't the method described in the first paragraph below work for this character?

As far as I can understand, to exhibit a Unicode code point in Word, one should press the Alt key, followed by the character + and the Unicode code point, both typed on the numeric keypad. That’s ...
Belloc's user avatar
  • 169
1 vote
1 answer

How to make one keystroke emit 3 Unicode characters in a custom keyboard layout in Linux Mint?

I have created a custom keyboard layout for my regional language by editing different files. It works fine except for one problem. One of the characters needs 3 Unicode characters in combination, i.e. ...
Ritu Lahkar's user avatar
0 votes
1 answer

Interpret a Unicode value as real character in Notepad++

If I copy a Unicode encoded value as actual rendered character (for example from here - 1D400) and paste it into Notepad++, it really does show it as a "bold" character. But when I try to ...
Anka Petkova's user avatar
0 votes
0 answers

Git Bash doesn't display characters correctly, using Visual Studio Code's integrated terminal

I use Git Bash as the default terminal profile on Windows. When I execute this command, I get the below output: memlab run --scenario tests/oversized-object/index.js > 'command' �����ڲ����ⲿ���Ҳ���...
Lin Du's user avatar
  • 101
0 votes
0 answers

Unicode glyphs for wysiwyg ruler bar symbols (margin, left-tab,right-tab, center-tab and decimal-tab)

Various word processors and text layout software have a "ruler bar" to indicate margins, indentation, and tab locations for a block of text. The symbology used in ruler bars is fairly ...
Dan Menes's user avatar
  • 101
1 vote
1 answer

Bash prompt with zero width control characters

To prevent the prompt from wrapping around right-to-left text, I want to insert a zero width LRM into the bash prompt (U+200E). Since this is a zero width character, my instinct was to wrap it with \[&...
Shachar Shemesh's user avatar
1 vote
1 answer

Correct way to encode mixed width text in Unicode?

From what I read, the fullwidth glyphs in Unicode are provided solely for backward compatibility and lossless roundtrip with legacy standards such as Shift-JIS. The rationale seems to be that Unicode ...
SuibianP's user avatar
1 vote
1 answer

Unicode sequence for custom character with dot underneath

Unicode has code points for certain Roman characters with a dot underneath, for instance U+1E04 — "Latin Capital Letter B With Dot Below" but code points don't exist for all letters. How can ...
ErikR's user avatar
  • 173
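A common way to get a dot below on a letter that has no precomposed code point is to follow the base letter with U+0323 COMBINING DOT BELOW. The sketch below only illustrates that idea and is not taken from the question or its answer; the choice of Q as a letter without a precomposed dot-below form is an assumption made for the example.

```python
import unicodedata

precomposed = "\u1e04"    # LATIN CAPITAL LETTER B WITH DOT BELOW
combining_dot = "\u0323"  # COMBINING DOT BELOW

# Canonical decomposition (NFD) splits the precomposed letter into base + mark.
decomposed = unicodedata.normalize("NFD", precomposed)

# Composition (NFC) folds base + mark back into the precomposed code point.
recomposed = unicodedata.normalize("NFC", "B" + combining_dot)

# Assumption for illustration: Q has no precomposed dot-below form,
# so NFC leaves the two-code-point sequence as it is.
q_dot = unicodedata.normalize("NFC", "Q" + combining_dot)

print([f"U+{ord(ch):04X}" for ch in decomposed])
print(recomposed == precomposed)
print(len(q_dot))
```

Whether the combined sequence renders well depends on the font and the rendering engine, which this sketch does not cover.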
2 votes
1 answer

VIM uses wrong encoding - but only in status messages

I ran into a strange issue with my ArchLinux setup. Vim uses correct encoding for reading/displaying files but these status messages (which displays the current mode or reports back when the buffer is ...
Gabor Garami's user avatar
1 vote
2 answers

Excel treats unicode characters as text

I have received an Excel file which contains data from a government website, whose text is in Hindi. When I open the CSV file, it displays the Hindi text in Unicode characters like the ...
Schmid's user avatar
  • 13
0 votes
1 answer

How to disable Ctrl+Shift+u keyboard shortcut (unicode-selector) in Fedora 39 KDE-Spin (Wayland)

0. How can I disable the Ctrl+Shift+u keyboard shortcut (unicode-selector) in Fedora 39 KDE-Spin (Wayland-Session)? Whenever I hit this key combination on a GTK+-app, the infamous underlined "u&...
LizLee's user avatar
  • 1
1 vote
2 answers

How to write correctly arbitrary fractions in Unicode?

How do I correctly write arbitrary fractions according to the Unicode standard for the OpenType feature frac? My use case is: I write data in a text document (UTF-8) and have a self-written program ...
Tobias Wohlfarth's user avatar
3 votes
1 answer

How do I copy the URL contained in the Internet shortcut with unicode characters? (command line for many Internet shortcuts)

The subject was partly covered 8 years ago (here: How do I copy the URL contained in the Internet shortcut?). I have thousands of Internet shortcuts whose contained URLs I would like to copy to txt ...
dg lr's user avatar
  • 57
0 votes
1 answer

Can unicode characters requiring the number pad, e.g. U+3041, be typed using the On-Screen Keyboard?

Suppose that I want to type 'ぁ' (Unicode U+3041) using Windows 10's On-Screen Keyboard without changing my language away from US English. Can it be done? As far as my research indicates, this would ...
J. Mini's user avatar
  • 216
1 vote
2 answers

Unicode characters became invisible since Windows 10 update

After a recent Windows 10 update, emoji became invisible in several apps (Google Chrome, MS Word). The behavior of MS Word is particularly weird: When I open a DOCX file that contains emoji (set with ...
root's user avatar
  • 1,533
0 votes
1 answer

If the `ffi` group of characters is smushed together into one character, then why can I select each of them separately?

In the video Why You Can Tweet More In Japanese: What Counts As A Character?, the author says that: Unicode has a single character for some English ligatures, like "ffi" - notice how the ...
Ooker's user avatar
  • 2,131
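The single-character "ffi" mentioned in the excerpt above corresponds to U+FB03 LATIN SMALL LIGATURE FFI. A minimal Python sketch of how that one code point relates to the three plain letters follows; it is only an illustration. Whether a given page or video uses this code point or a font-level ligature drawn over three separate letters cannot be told from the excerpt.

```python
import unicodedata

ligature = "\ufb03"  # LATIN SMALL LIGATURE FFI, a single code point

print(unicodedata.name(ligature))
print(len(ligature))

# Compatibility normalization (NFKC) expands the ligature into plain letters.
expanded = unicodedata.normalize("NFKC", ligature)
print(expanded, len(expanded))
```

If the text stores three separate letters and the font merely draws them as one glyph, each letter remains individually selectable, which is one possible reading of the behaviour the question describes.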
0 votes
1 answer

Notepad on Windows 11 not respecting Unicode monospace characters

I'm using box-drawing characters to represent trees. I need other people using Windows to see them so I can assume they all have Notepad installed. But even if I select Consolas as the font, here's ...
Juan's user avatar
  • 125
0 votes
0 answers

How to add multiple entries to MS-Word "Math AutoCorrect"?

In MS-Word Math AutoCorrect there are entries for all alphabets in Script type as below - Say, I want to add a similar suite to represent vectors with all the alphabets. Currently I can do it one ...
Anirban Chakraborty's user avatar
1 vote
2 answers

Website that used to display unicode properly no longer does so

This is a problem that I've encountered a couple of times recently. Here is my most recent experience: Trying to browse, a page that contains Japanese text. The ...
homework's user avatar
1 vote
0 answers

CJK unicode characters output from `wine reg query` displayed on terminal are fine. But they become question marks after pipe

When I run wine reg query 'HKEY_CURRENT_USER\Software\Akelsoft\AkelPad\Recent' on my machine, the output is fine and contains something like file0 REG_MULTI_SZ Z:\home\x\怎.txt on my terminal. ...
cshu's user avatar
  • 175
1 vote
1 answer

Problems with Word link to Excel when Windows is set to use UTF-8

I have some Word documents that have fields that are automatically updated from an Excel file using Word's built-in links to Excel (as in copy > paste special > paste as link). After setting up a ...
Sergio Duarte's user avatar
0 votes
1 answer

Change encoding to Unicode in .csv exported from Gmail

I'd like to copy the typed-in addresses from Gmail to Outlook 2021 (Windows 10). To do this, I'm using the Google Contacts Export to Outlook CSV function and nk2edit. The problem is that the special ...
MorganInTheMorning's user avatar
1 vote
1 answer

Random unicode characters while typing normally in Windows 10

Sometimes when I'm typing in Windows I get some Unicode variant of the character I was trying to type. For example: ŕ instead of r, ś instead of s, ḳ instead of k, etc. Things I've already tried: Try ...
webjuicefern's user avatar
7 votes
4 answers

Convert UTF-16 LE to UTF-8 in windows via command line

(question re-written to be more useful) I have a batch script which will interact with command line programs, take their output, and then perform decisions based on that output. One of the programs I ...
bfh47's user avatar
  • 103
9 votes
3 answers

fixing mis-encoded Chinese in file names

I'm saddled with a bunch of files whose names are garbled beyond recognition. Even though I more or less know what those names originally contained, fixing them by hand would involve a lot of hassle, ...
wildekat's user avatar
0 votes
0 answers

In Excel, how do I handle someone else's workbook that uses UNICODE?

I have been sent an Excel workbook in which I find that a cell that appears to contain the letter "A" produces FALSE if I compare it with "A". That is, if the cell in question were ...
tkp's user avatar
  • 464
0 votes
0 answers

Is the dataset that underpins the Windows Character Map utility accessible somewhere on the system?

The Windows Character Map utility displays the glyphs available in the chosen font, along with the Unicode codepoint and a name for a selected character. Is the dataset that underpins this utility ...
Tim's user avatar
  • 861
2 votes
0 answers

tmux inside Cygwin mintty - unicode characters broken depending on command

Please help me understand what's going on with some unicode characters looking right inside mintty + tmux in some command outputs, but not in others. Sample commands: for c in 00AE 1F007 1F32D 1F603; ...
EndlosSchleife's user avatar
0 votes
1 answer

Missing some Unicode glyphs on Windows 7, is there way to fix/patch it?

I have an arrowsbe.txt file, with the following content (in hex): FE FF 00 41 00 42 00 43 2B A1 00 20 2B A2 You see, it is UTF-16BE encoded, with BOM at file start. There are two not-so-often seen ...
Jimm Chen's user avatar
  • 6,014
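The byte sequence quoted in the excerpt above can be checked by decoding it. The Python sketch below is an illustration and not part of the question; it uses the generic "utf-16" codec, which reads the leading FE FF byte order mark to select big-endian decoding.

```python
# Bytes copied from the excerpt: BOM (FE FF) followed by UTF-16BE code units.
raw = bytes.fromhex("FE FF 00 41 00 42 00 43 2B A1 00 20 2B A2")

# The "utf-16" codec consumes the BOM and uses it to pick the byte order.
text = raw.decode("utf-16")

# List the decoded code points in U+XXXX notation.
code_points = [f"U+{ord(ch):04X}" for ch in text]
print(code_points)
```

The two code points outside ASCII are U+2BA1 and U+2BA2; whether they display depends on the installed fonts.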
1 vote
0 answers

Unicode letters replaced by ***** (letter n )

the letter n is being replaced with **** in my console when I run dbt. In text format: 11:59:24 Fou*****d 397 models, 15 tests, 0 s*****apshots, 0 a*****alyses, 639 macros, 0 operatio*****s, 11 seed ...
Flyingfish's user avatar
0 votes
1 answer

unicode-escaping with a standard Unix tool

I have cooked up the following Python script. It does 1 thing and it does it well: it does whatever Unicode magic unicode-escape does in python3 to some text. #!/usr/bin/env python3 import sys text ...
Jachin's user avatar
  • 111
1 vote
1 answer

Convert Korean files that are showing up incorrectly to utf-8 - character shows Çѱ¹Ÿî

I was just about to ask this after a long time of searching, so I decided to answer my own question... I downloaded Korean subtitles in an .smi file that was in a zip archive. When I extracted it, the ...
iateadonut's user avatar
7 votes
3 answers

What is a quick way to write "dagger" sign in MS Word equation mode?

In MS Word equation mode with the Unicode converter, I write symbols by putting the "\" sign and then typing the specific Unicode, i.e., for the summation sign I write "\sum" and ...
Luqman Saleem's user avatar
3 votes
1 answer

How can I set my system's default encoding to UTF-16?

My daily activity involves usage of English, French, Spanish, and when I save personal copies of web pages or other documents, the full character range of those languages finds its way into filenames ...
Eric Marceau's user avatar
1 vote
1 answer

zmv to rename Unicode characters (e.g. u0308 'COMBINING DIAERESIS') on Mac/zsh-shell

While cleaning a file server I found unwanted or non-ASCII characters in many filenames. To rename unwanted filenames, I normally use the comfortable zmv command in a zsh shell on an OSX machine. To ...
mfuerli's user avatar
  • 11
0 votes
0 answers

Problems with certain Unicode characters

I am using Word, and I needed to insert the name of the Chinese Nüshu script in Nüshu (𛆁𛈬). There wasn't a font that supported it, so I downloaded every font I could find that supported it. Noto ...
LEG's user avatar
  • 1
2 votes
0 answers

Is it possible to write the Lovecraftian Sigil of the Gateway in unicode?

It isn't technically copyrighted and doesn't seem to be subject to trademark, and is kind of a cultural symbol, so for all I know there's a hidden symbol assignment for it somewhere... but the Sigil ...
Michael Macha's user avatar
0 votes
1 answer

Can't copy Chinese text from PDF

I tried to open it with Google Docs, and tried the site i2orc to convert the file. But in both cases the Chinese characters are missing from the output, just as when I copy/paste them from the ...
Vitor Pinheiro's user avatar
2 votes
0 answers

Not able to copy Korean characters from PDF file using Acrobat Reader

I decided to study Korean and I found a book that I want to copy some vocabulary from. The problem is that when I paste the copied text, it turns into gibberish characters. Any idea how to solve this? I ...
Wax's user avatar
  • 121
5 votes
1 answer

Unicode backspace key symbol ⌫

Is there a Unicode symbol for the backspace key ⌫, the x inside a left-pointing arrow? I know that Unicode has a "BACKSPACE" control character (U+0008) that it inherited from ASCII, and it ...
Jo Liss's user avatar
  • 4,348
0 votes
1 answer

Binding unicode λ symbol (code+glyph) to key combination

I would like to type Unicode λ in a clear way (key L and alt modifier, for example) and not pressing SHIFT+CTRL+u03bb each time (or doing a poor trick like copy/paste). I've tried a lot of ways to ...
Daniel Bandeira's user avatar
1 vote
1 answer

How can I type rough breathing marks in Classical Greek using the Windows 11 Greek Polytonic keyboard?

I'm learning to use the Greek Polytonic keyboard in Windows 11 to type Classical Greek. I can successfully type the following characters (using α as an example): ᾶ, ἀ, ἄ, ἂ, ά, ά, ἆ, ᾱ. For example, ἄ ...
user3457004's user avatar
0 votes
1 answer

Why can't I print this in windows ( ه҈҉҉҉҉҉҉҉҉҉҉҉҉҉҉҉҉҉҉҉҉҉҉҉҉҉҉ ')?

I got this from the following question: What are these ? ه҈҉҉҉҉҉҉҉҉҉҉҉҉҉҉҉҉҉҉҉҉҉҉҉҉҉҉ I can't get this to print in windows powershell. I tried using chcp 65001. BTW I can print 北京 but not ه҈҉҉҉҉҉҉҉҉҉҉҉...
dCoder's user avatar
  • 101
4 votes
0 answers

Why is it so HARD to have Windows 10 CMD console display Unicode correctly?

Before showing the hard (difficult) example of Windows 10 (Win10.21H2 in this case), let me show an easy example using Ubuntu 22.04. I place file names with Chinese and Korean characters on a USB flash ...
Jimm Chen's user avatar
  • 6,014