Characters in C
The char Data Type
The `char` type stores a single byte (code unit) from the program's execution character set. By definition `sizeof(char)` is 1, so a `char` always occupies exactly one byte; what is implementation-defined is the number of bits in that byte (`CHAR_BIT` from `<limits.h>`, at least 8).
`char` holds an integer value corresponding to a character code (ASCII on most systems). Because it is an integer type, you can do arithmetic on it (e.g., `'A' + 1` yields the code for `'B'` on ASCII-based systems).
Note: `char` by itself may be either signed or unsigned depending on the compiler/platform.
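The properties described above can be checked on a given compiler with a short sketch like the following. The helper names `plain_char_is_signed` and `print_char_properties` are hypothetical and used only for illustration; `CHAR_BIT`, `CHAR_MIN`, and `CHAR_MAX` are standard macros from `<limits.h>`.

```c
#include <limits.h>
#include <stdio.h>

/* Hypothetical helper: returns 1 if plain char is signed on this
   implementation, 0 if it is unsigned. */
int plain_char_is_signed(void) {
    return CHAR_MIN < 0;
}

/* Hypothetical helper: prints the size, bit width, and range of plain char. */
void print_char_properties(void) {
    printf("sizeof(char) = %zu\n", sizeof(char));  /* always 1 by definition */
    printf("CHAR_BIT     = %d\n", CHAR_BIT);       /* at least 8 */
    printf("range        = %d..%d\n", CHAR_MIN, CHAR_MAX);
    printf("plain char is %s\n",
           plain_char_is_signed() ? "signed" : "unsigned");
}
```

Calling `print_char_properties()` from `main` reports the values for the current platform. The signedness line in particular can differ between targets, which is why portable code should not depend on it.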
ASCII Representation
ASCII (American Standard Code for Information Interchange) maps characters to integer values. Many modern C environments use ASCII (or ASCII-compatible encodings such as UTF-8), so the following mappings are common:
For example (ASCII): 'A' = 65, 'B' = 66, 'a' = 97, '0' = 48.
This is why `printf("%d\n", 'A');` prints `65` on ASCII-based systems.
Character | ASCII Value |
---|---|
'A' | 65 |
'a' | 97 |
'0' | 48 |
' ' (space) | 32 |
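Because character codes are plain integers, simple conversions reduce to arithmetic. The sketch below uses two hypothetical helpers, `digit_value` and `ascii_to_upper`. The first relies only on the C standard's guarantee that `'0'` through `'9'` have consecutive codes; the second assumes an ASCII-compatible character set, where the lowercase and uppercase letters are each contiguous.

```c
#include <stdio.h>

/* Hypothetical helper: numeric value of a decimal digit character,
   or -1 if c is not a digit. Portable: the C standard guarantees that
   '0'..'9' are consecutive. */
int digit_value(char c) {
    if (c >= '0' && c <= '9') {
        return c - '0';
    }
    return -1;
}

/* Hypothetical helper: ASCII-only uppercase conversion.
   Assumes 'a'..'z' and 'A'..'Z' are contiguous, as in ASCII. */
char ascii_to_upper(char c) {
    if (c >= 'a' && c <= 'z') {
        return (char)(c - ('a' - 'A'));
    }
    return c;
}
```

On an ASCII system, `digit_value('7')` returns 7 and `ascii_to_upper('b')` returns `'B'`. For code that must not assume ASCII, the standard `toupper` from `<ctype.h>` is the safer choice.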
Character Literals vs Strings
A single character literal uses **single quotes** → `'A'` and has type `int` in expressions (it is an integer constant with the character code).
A string literal uses **double quotes** → `"Hello"` and has type `array of char`, automatically terminated by a null byte `\0`.
Example: `char s[] = "Hi";` has three bytes: `'H'`, `'i'`, and `\0`.
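The difference between the two kinds of literal shows up directly in `sizeof`. The following is a minimal sketch; the function names `hi_array_size`, `hi_length`, and `print_literal_sizes` are made up for this illustration.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: bytes occupied by an array initialized from "Hi",
   including the terminating '\0'. */
size_t hi_array_size(void) {
    char s[] = "Hi";
    return sizeof s;
}

/* Hypothetical helper: length of the same string as seen by strlen,
   which stops before the terminator. */
size_t hi_length(void) {
    char s[] = "Hi";
    return strlen(s);
}

/* Hypothetical helper: prints the sizes of a char, a character literal,
   and a string literal. */
void print_literal_sizes(void) {
    printf("sizeof(char) = %zu\n", sizeof(char));
    printf("sizeof('A')  = %zu\n", sizeof('A'));   /* same as sizeof(int) in C */
    printf("sizeof(\"Hi\") = %zu\n", sizeof("Hi")); /* 'H', 'i', '\0' */
}
```

Here `hi_array_size()` returns 3 while `hi_length()` returns 2; the one-byte difference is the null terminator. Note that `sizeof('A') == sizeof(int)` holds in C but not in C++, where a character literal has type `char`.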
Signedness of char
`char` may be signed or unsigned depending on the implementation. When `char` is signed and 8 bits wide, byte values above 127 (for example 0xE9) are stored as negative numbers.
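When passing a possibly negative `char` to the `<ctype.h>` classification functions (`isalpha`, `isdigit`, and so on), cast it to `unsigned char` first: these functions require an argument that is either representable as `unsigned char` or equal to `EOF`, and any other negative value is undefined behavior.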
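A small sketch of the cast in practice, assuming 8-bit bytes. The wrappers `byte_value` and `is_alpha_char` are hypothetical names introduced here; `isalpha` is the standard function from `<ctype.h>`.

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper: the byte value of c in the range 0..UCHAR_MAX,
   regardless of whether plain char is signed. */
int byte_value(char c) {
    return (unsigned char)c;
}

/* Hypothetical helper: wraps isalpha so callers can pass a plain char.
   The cast keeps the argument within the range isalpha accepts. */
int is_alpha_char(char c) {
    return isalpha((unsigned char)c) != 0;
}
```

With a byte such as `(char)0xE9`, printing the `char` directly with `%d` may show a negative number on signed-`char` platforms, whereas `byte_value` yields 233 (0xE9) either way.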
Common Escape Sequences
Character literals can also represent non-printable characters using escape sequences.
Literal | Meaning |
---|---|
'\n' | newline |
'\t' | horizontal tab |
'\'' | single quote |
'\"' | double quote (a plain '"' is also valid in a character literal) |
'\\' | backslash |
'\x41' | hex code 0x41 ('A' in ASCII) |
'\101' | octal code 101 ('A' in ASCII) |
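The escape sequences in the table are ordinary integer constants, so their codes can be printed or compared like any other value. This is a minimal sketch with two hypothetical helpers; the comparison in `escapes_match_A` is expected to hold only on ASCII-compatible systems.

```c
#include <stdio.h>

/* Hypothetical helper: returns 1 if the hex and octal escapes from the
   table both denote 'A'. True on ASCII-based systems. */
int escapes_match_A(void) {
    return '\x41' == 'A' && '\101' == 'A';
}

/* Hypothetical helper: prints the integer code of several escape sequences.
   The doubled backslashes make printf show the escape as text. */
void print_escape_codes(void) {
    printf("\\n  -> %d\n", '\n');
    printf("\\t  -> %d\n", '\t');
    printf("\\'  -> %d\n", '\'');
    printf("\\\" -> %d\n", '\"');
    printf("\\\\ -> %d\n", '\\');
}
```

Note that a hex escape consumes as many hex digits as follow it, so in string literals a sequence like `"\x41B"` is read as a single (out-of-range) escape rather than `'A'` followed by `'B'`.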
Example
    #include <stdio.h>
    #include <ctype.h>

    int main(void) {
        char letter = 'A';
        char next = (char)(letter + 1); // 'B' on ASCII-based systems
        printf("Character: %c, ASCII: %d\n", letter, (int)letter);
        printf("Next: %c, ASCII: %d\n", next, (int)next);

        char c = '\n';
        printf("Escape example: newline as int = %d\n", (int)c);

        // ctype example: always cast to unsigned char
        char maybe = 'z';
        if (isalpha((unsigned char)maybe)) {
            printf("'%c' is alphabetic\n", maybe);
        }
        return 0;
    }
Output (on an ASCII-based system):

    Character: A, ASCII: 65
    Next: B, ASCII: 66
    Escape example: newline as int = 10
    'z' is alphabetic
Best Practices
• Use single quotes for character literals (`'X'`) and double quotes for strings (`"X"`).
• Avoid assuming all environments use ASCII; many are ASCII-compatible, but portability matters.
• Be aware that `char` may be signed or unsigned; cast to `unsigned char` before calling `<ctype.h>` functions such as `isalpha`.
• Use `%c` to print characters and `%d` to print their integer codes (cast to `int` to be explicit).
• For full Unicode characters beyond ASCII, `char` arrays will typically hold UTF-8 bytes; handling full Unicode may require multibyte/wide character APIs.
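To illustrate the last point, the sketch below contrasts the byte count that `strlen` reports with the number of Unicode code points in a UTF-8 string. The function `utf8_code_points` is a hypothetical helper, not a standard API, and it assumes the input is valid UTF-8: it simply skips continuation bytes, which have the bit pattern `10xxxxxx`.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: counts code points in a UTF-8 string by counting
   every byte that is NOT a continuation byte (10xxxxxx).
   Assumes s is valid, null-terminated UTF-8; performs no validation. */
size_t utf8_code_points(const char *s) {
    size_t count = 0;
    for (; *s != '\0'; ++s) {
        /* The cast avoids sign problems where plain char is signed. */
        if (((unsigned char)*s & 0xC0) != 0x80) {
            ++count;
        }
    }
    return count;
}
```

For the string `"h\xC3\xA9llo"` (the word "héllo" with `é` encoded as the two bytes 0xC3 0xA9), `strlen` returns 6 while `utf8_code_points` returns 5. Code points are still not the same as user-perceived characters, so this is only a first approximation.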