Do-It-Yourself Web Authoring - a beginner's HTML tutorial

Sample image
A random photo... (The Hudson River at 125th Street about 2002)

Frank da Cruz
Updated in 2019 and 2021 for HTML5 and "fluidity".

This page shows how to create Web pages by hand, the original way. Although today most Web pages are created by "Web authoring systems" that are designed to shield you from technical details, the fact is that HTML (the "programming" language of the Web) is not that difficult, as you can see if you follow this tutorial. To get an idea of what is possible with this technique, see these 100% hand-made websites:

CONTENTS

  1. Creating a Web Page
  2. HTML Syntax
  3. Special Characters
  4. Converting Plain Text to HTML
  5. Effects
  6. Lists
  7. Links
  8. Tables
  9. Viewing your Web page
  10. Installing your Web Page on the Internet
  11. Where to go from here
  12. Postscript: Cell Phones

You can create a Web page on your desktop computer but nobody can see it but you. If you want other people to be able to see your Web pages, you need an account on a computer that has a Web server. Nowadays most people have their own computers on their desks, but normally they don't have Web servers and anyway you don't want the whole world coming into your desktop computer to see your web page because (a) it's not designed for that, and (b) who knows what else they might see. And (c) for security reasons, Web servers should be managed by professionals. Most institutions have big central shared computers for this purpose, which usually have a Unix-like operating system such as Linux. You need an account on one of these so you can put your web pages there. If you don't have access to such a computer, you can get a low-cost account on a service like Panix.com.

You can still create Web pages on your own computer and look at them with your computer's Web browser, but for other people to see them, you have to upload them to the "big" computer that has the Web server. The rest of this document is about how to create your first Web page.

1. Creating a Web Page

This page was typed by hand. Anybody can do this; you don't need any special "web creation" tools or HTML editors, and the pages you make can be viewed from any browser. To see how this page was made, choose View Source (or View Page Source, or View Document Source) in your browser's menu (or, in at least Chrome and Firefox, Ctrl-U on your keyboard). A simple web page like this one is just plain text with HTML commands (markup) mixed in. HTML commands (properly called "tags") themselves are plain text.

When you're just learning and want to experiment, you can do everything on your PC. Create a new directory ("folder") for your website, and then put the web-page files (HTML plus any pictures) in it. Use NotePad or another plain-text editor (not a word processor) on your PC to create your "home page", a file named index.html, which you can view locally with your Web browser. (You can also use a word processor such as Word or WordPad if you save in "plain text", "text", "text document", or "text document MS-DOS format".) Later I'll explain how you can install your web site on the Internet.

Once you've made your "home page" (index.html) you can add more pages to your site, and your home page can link to them.

2. HTML Syntax

Web pages are written in Hyper Text Markup Language (HTML). HTML has three special characters: <, &, >. An HTML command is enclosed in <...>, for example <p>, which is a paragraph separator, or <b> ("begin bold") and </b> ("end bold"). So the following HTML text:
This sentence contains <b>bold</b> text.
produces:

This sentence contains bold text.

A Web page starts with a series of HTML commands, and ends with a few more. The contents go in between:

<!DOCTYPE HTML>
<html lang="en">
<head>
<META charset="UTF-8">
<META name="viewport"
 content="width=device-width, initial-scale=1.0">
<title>Sample Web Page</title>
</head>
<body>

(Contents go here)

</body>
</html>

The first line (DOCTYPE) specifies which markup language the page uses (HTML = Hypertext Markup Language); just copy this line. The next line, <html lang="en">, starts the page and specifies the (human) language it is written in (language codes are specified here), and is matched by the line </html>, which closes the page. <head> starts the head section, which contains a title to be displayed on the browser's title bar, a declaration of the character set (nowadays it should always be UTF-8), and the "viewport" line, which is needed for the page to display properly on cell phones, "smart" watches, etc. </head> closes the head section. The head can also contain other items such as style parameters that you can learn about later, for example by asking Google ("HTML how do I change the font size?").

The <body> tag starts the body of the document, which is closed by the </body> tag.

As you can see, most HTML commands come in begin-end pairs: <b>...</b>, <head>...</head>, etc. The closing part of the command has a slash (/) between the < and the first letter of the command.

Blank lines and line breaks are ignored. The browser automatically "flows" your text into lines and paragraphs that fit in its window. Paragraphs must be separated by <p>. Line breaks can be forced by <br>.
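
As a small illustration of these rules, here is a sketch of some text that could go between <body> and </body>. The wording is made up for this example. The line breaks typed in the source are ignored, <p> starts a new paragraph, and <br> forces a new line:

```html
<!-- The browser joins these two source lines into one flowing paragraph. -->
This is the first paragraph. It is typed on two lines,
but the browser treats the line break as an ordinary space.
<p>
<!-- <p> starts a new paragraph; <br> forces a line break inside it. -->
This is the second paragraph.<br>
This sentence begins on a new line.
```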

Example for Windows:
Use the mouse to copy the HTML above into NotePad. Then save the file (File -> Save As...) in your Web directory as index.html. Suppose your Windows username is Olga. Then (depending on which version of Windows you have) this might be:
C:\Users\Olga\Desktop\Web\index.html
Now to see your new web page, just double-click on the Web folder and then double-click on index.html.

Now you're ready to start adding "content" to your web page. Go back to NotePad and replace the title and "(Contents go here)" with whatever you want. Any time you want to see the result, use File -> Save in NotePad and then click the Reload button on your browser.

The next sections tell how to achieve different kinds of effects.

3. Special Characters

HTML special "character entities" start with ampersand (&) and end with semicolon (;), like "&euro;" = "€". The ever-popular "no-break space" is &nbsp;. There are special entity names for accented Latin letters and other West European special characters such as:

&auml; a-umlaut  ä 
&Auml; A-umlaut  Ä 
&aacute; a-acute  á 
&agrave; a-grave  à 
&ntilde; n-tilde  ñ 
&szlig; German double-s  ß 
&thorn; Icelandic thorn  þ 

(The table above is shown in the basic, default style of HTML. Of course there are many ways to customize the appearance of tables; more about this below.)

Examples:
For Spanish you would need:
&Aacute; (Á), &aacute; (á), &Eacute; (É), &eacute; (é), &Iacute; (Í), &iacute; (í), &Oacute; (Ó), &oacute; (ó), &Uacute; (Ú), &uacute; (ú), &Uuml; (Ü), &uuml; (ü), &Ntilde; (Ñ), &ntilde; (ñ), &iquest; (¿), &iexcl; (¡).
Example: Añorarán = A&ntilde;orar&aacute;n.

For German you would need:
&Auml; (Ä), &auml; (ä), &Ouml; (Ö), &ouml; (ö), &Uuml; (Ü), &uuml; (ü), &szlig; (ß).
Example: Grüße aus Köln = Gr&uuml;&szlig;e aus K&ouml;ln.

CLICK HERE for a complete list. When the page encoding is UTF-8, which is recommended, you can also enter any character at all (Roman, Cyrillic, Arabic, Hebrew, Greek, Japanese, etc.), either as numeric entities or (if you have a way to type them) directly from the keyboard.

And remember: if you want to include <, &, or > literally in text to be displayed, you have to write &lt;, &amp;, &gt;, respectively.
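
For instance, here is a made-up line of text that uses all three special characters literally. The comment shows what the browser displays:

```html
<!-- Displays as: 1 < 2 & 3 > 2 -->
1 &lt; 2 &amp; 3 &gt; 2
```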

4. Converting Plain Text to HTML

If you have a plain text file that you want to convert to HTML, load the file into a plain-text editor and then follow these steps.

  1. Change all occurrences of "&" to "&amp;".
  2. Change all occurrences of "<" to "&lt;".
  3. Change all occurrences of ">" to "&gt;".
  4. Change any accented letters to HTML entity names (previous section)*.
  5. Put "<p>" between each paragraph.
  6. Insert the standard prolog at the top, substituting an appropriate title.
  7. Add </body> and </html> at the end.
  8. Save the result as xxx.html, where xxx is the part of the original file's name before the dot, or whatever-else-you-want-to-call-it.html.
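
Here is a sketch of what these steps might produce. Assume a made-up plain text file, prices.txt, that contains two short paragraphs: "Prices at Müller & Sons are < $5 this week." and "Use x > y for comparisons." The file name, the text, and the title are invented for this example. Note that "&" has to be converted first (step 1); otherwise the "&" in the "&lt;" and "&gt;" you add in steps 2 and 3 would be converted too.

```html
<!DOCTYPE HTML>
<html lang="en">
<head>
<META charset="UTF-8">
<META name="viewport"
 content="width=device-width, initial-scale=1.0">
<title>Prices</title>
</head>
<body>
<!-- Steps 1-4: & became &amp;, < became &lt;, and the u-umlaut became &uuml; -->
Prices at M&uuml;ller &amp; Sons are &lt; $5 this week.
<!-- Step 5: <p> goes between the paragraphs -->
<p>
<!-- Step 3: > became &gt; -->
Use x &gt; y for comparisons.
</body>
</html>
```

The result would be saved as prices.html (step 8).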

If you are a Kermit user, you can find a script to convert plain text to HTML HERE.

If the text contains lists, tables, or other structures, read on.

If you have a Microsoft Word document you want to convert to HTML, and your copy of Word does not allow the file to be "Saved As" HTML, then save it as plain text and follow the same instructions. In this case you lose the "richness" (bold, italics, font changes, etc) when you save the file, and will have to put the effects back by hand (next section).

* Not necessary if your text is already encoded as UTF-8. If it's not UTF-8, you can identify the encoding in the <META charset="..."> directive, but this topic is a bit advanced for this simple tutorial.

5. Effects

The rest of this document shows some of what you can do with simple HTML commands, but I don't explain how to do it. To see that, just tell your browser to View Source and compare the HTML in the source window with the result in the original window.

Note: In this and the following sections, I use some "deprecated" features from earlier HTML versions because they are easier for beginners to learn (for example <big>...</big> versus <span style="font-size:120%">...</span>).

This sentence is bold. This sentence is in italics. This sentence is in bold italics. This sentence is in typewriter font. This sentence has underlined words and underlined bold words. This sentence has colored words. This sentence has big words. This one has very big words. This one has very small words.
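
In case you are not reading this in a browser and can't "view source", here is a sketch of markup that would produce effects like those in the paragraph above. It is not necessarily the exact markup of the original page, and the color (red) is just an assumption. The <tt>, <font>, and <big> tags are among the older features mentioned in the note above; browsers still display them.

```html
This sentence is <b>bold</b>.
This sentence is in <i>italics</i>.
This sentence is in <b><i>bold italics</i></b>.
This sentence is in <tt>typewriter font</tt>.
This sentence has <u>underlined words</u> and <u><b>underlined bold words</b></u>.
This sentence has <font color="red">colored words</font>.
This sentence has <big>big words</big>.
This one has <big><big>very big words</big></big>.
This one has <small><small>very small words</small></small>.
```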

This is a "blockquote", which is like a regular paragraph, but it has bigger margins. Begin a blockquote with <blockquote> and end it with </blockquote>. Environments such as blockquotes, lists, etc, that have a beginning and an end always use paired commands like <blah>...</blah>.

This is a blockquote inside another blockquote, which shows how HTML environments can be "nested".

Here we are back in the first blockquote again.

And here we are back outside of the first blockquote.
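
The nesting in the blockquotes above could be written like this (a sketch with the text shortened). Each <blockquote> is closed by its own </blockquote>, and the inner one is closed before the outer one:

```html
<blockquote>
This is a "blockquote", which is like a regular paragraph,
but it has bigger margins.
  <blockquote>
  This is a blockquote inside another blockquote.
  </blockquote>
Here we are back in the first blockquote again.
</blockquote>
And here we are back outside of the first blockquote.
```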

6. Lists

An Unordered (bullet) List is written with <ul>...</ul> and an Ordered (numbered) List with <ol>...</ol>. Here is an ordered list:

  1. This is a List Item (<li>).
  2. This is another item.
  3. This is yet another item.
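
A sketch of the markup for both kinds of list, using the same three items. The only difference is the outer tag: <ul> gives bullets and <ol> gives numbers.

```html
<!-- Unordered (bullet) list -->
<ul>
<li>This is a List Item (&lt;li&gt;).</li>
<li>This is another item.</li>
<li>This is yet another item.</li>
</ul>

<!-- Ordered (numbered) list: the browser supplies the numbers -->
<ol>
<li>This is a List Item (&lt;li&gt;).</li>
<li>This is another item.</li>
<li>This is yet another item.</li>
</ol>
```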

And here is a Description List (<dl>), using Kermit commands as an example:

SET FILE TYPE BINARY
This command tells Kermit to transfer files in binary mode. In other words, don't mess with the file, just send it as-is. The result on the receiving computer should be identical to the original.

SET FILE TYPE TEXT
This command tells Kermit to transfer files in text mode. This should be used with plain-text files, especially when transferring them between computers with different file formats or operating systems, such as VMS and Unix, or Unix and Windows. It converts the file's format and character-set (if necessary) so the received file is usable on the destination computer.
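
A description list like the one above could be marked up as follows (a sketch, with the descriptions shortened). Each term goes in <dt> and its description in <dd>:

```html
<dl>
<dt>SET FILE TYPE BINARY</dt>
<dd>This command tells Kermit to transfer files in binary mode.</dd>
<dt>SET FILE TYPE TEXT</dt>
<dd>This command tells Kermit to transfer files in text mode.</dd>
</dl>
```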

You can have lists within lists:

  1. A gromet
  2. A widget
  3. A framus, which consists of the following components:
  4. A doodad.
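
To put a list within a list, place the inner list inside one of the outer list's <li> items. In this sketch the two inner items are placeholders invented for the example, not the components listed on the original page:

```html
<ol>
<li>A gromet</li>
<li>A widget</li>
<li>A framus, which consists of the following components:
  <!-- The inner list sits inside the third item, before its closing tag -->
  <ul>
  <li>First component (placeholder)</li>
  <li>Second component (placeholder)</li>
  </ul>
</li>
<li>A doodad.</li>
</ol>
```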

And you can have ordered lists that use letters instead of numbers:

  a. Pennies
  b. Nickels
  c. Dimes
  d. Quarters
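
A lettered list is an ordinary <ol> with a "type" attribute. This sketch assumes lowercase letters; type="A" gives capital letters instead.

```html
<ol type="a">
<li>Pennies</li>
<li>Nickels</li>
<li>Dimes</li>
<li>Quarters</li>
</ol>
```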

7. Links

Links can be internal within a Web page (like to the Table of Contents at the top), or they can be to external web pages or pictures on the same website, or they can be to websites, pages, or pictures anywhere else in the world.

Here is a link to the Kermit Project home page. And here is what the HTML looks like:

<a href="http://www.kermitproject.org/">
Kermit Project home page
</a>

The part inside the quotes is called the URL (Uniform Resource Locator). Here is a link to Section 6 of the page you are reading, and the HTML:

<a href="#lists">Section 6</a>
The "#" indicates an internal section ID, in this case:
<h3 id="lists">6. Lists</h3>

Here is a link to Section 4.0 of another document, at another website: the C-Kermit for Unix Installation Instructions. And the HTML:

<a href="http://kermitproject.org/ckuins.html#x4.0">
Section 4.0
</a>

Here is a link to a picture: CLICK HERE to see it.
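
A link to a picture looks just like a link to a page, except that the URL is the name of an image file. In this sketch, skyline.jpg is a made-up file name, assumed to be in the same directory as the page:

```html
Here is a link to a picture:
<a href="skyline.jpg">CLICK HERE</a> to see it.
```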

If you want to link to a particular section of somebody else's Web page, visit the page, "view source", search for the text at that spot and see if there is an "id=" clause; if so, use the ID as shown just above; if not you're out of luck.

If you want to link to a particular page of a PDF document, just put "#page=123" (replace by the desired number) at the end of the URL.
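
For example (a sketch; the address www.example.com/manual.pdf is a placeholder, not a real document):

```html
<a href="https://www.example.com/manual.pdf#page=123">
Page 123 of the manual
</a>
```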

8. Tables

Here's a simple table with some headings and a few rows:

Heading A    Heading B    Heading C
Cell 1A      Cell 1B      Cell 1C
Cell 2A      Cell 2B      Cell 2C
Cell 3A      Cell 3B      Cell 3C
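
A sketch of the markup for a simple table like this one: <table> encloses the whole table, each <tr> is a row, <th> is a heading cell, and <td> is an ordinary ("data") cell.

```html
<table>
<tr><th>Heading A</th><th>Heading B</th><th>Heading C</th></tr>
<tr><td>Cell 1A</td><td>Cell 1B</td><td>Cell 1C</td></tr>
<tr><td>Cell 2A</td><td>Cell 2B</td><td>Cell 2C</td></tr>
<tr><td>Cell 3A</td><td>Cell 3B</td><td>Cell 3C</td></tr>
</table>
```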

Same table again but with borders:

Heading A    Heading B    Heading C
Cell 1A      Cell 1B      Cell 1C
Cell 2A      Cell 2B      Cell 2C
Cell 3A      Cell 3B      Cell 3C

The appearance with double borders is the default (and therefore easiest) table style. You can use table attributes to change the appearance.
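
One way to get the default borders is the "border" attribute of the <table> tag, sketched here with a shortened table. This is one of the older features that the Validator complains about, but browsers still honor it; it is an assumption that the original page does it this way.

```html
<!-- border="1" draws a border around the table and around each cell -->
<table border="1">
<tr><th>Heading A</th><th>Heading B</th><th>Heading C</th></tr>
<tr><td>Cell 1A</td><td>Cell 1B</td><td>Cell 1C</td></tr>
</table>
```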

Here's the same table again but with Column C right-adjusted:

Heading A    Heading B    Heading C
Cell 1A      Cell 1B        Cell 1C
Cell 2A      Cell 2B        Cell 2C
Cell 3A      Cell 3B        Cell 3C
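
To right-adjust a column, each cell in that column gets its own alignment. This sketch (shortened, and not necessarily how the original page does it) uses a "style" attribute on the Column C cells; the older align="right" attribute also still works in browsers.

```html
<table border="1">
<tr><th>Heading A</th><th>Heading B</th><th>Heading C</th></tr>
<tr><td>Cell 1A</td><td>Cell 1B</td><td style="text-align:right">Cell 1C</td></tr>
<tr><td>Cell 2A</td><td>Cell 2B</td><td style="text-align:right">Cell 2C</td></tr>
</table>
```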

And finally, here it is again with some "style" parameters applied to get rid of the ugly double borders. You can see the style parameters in the <style> section of the <head> at the top of this page, if you "view source".

Heading A    Heading B    Heading C
Cell 1A      Cell 1B      Cell 1C
Cell 2A      Cell 2B      Cell 2C
Cell 3A      Cell 3B      Cell 3C

So with just three lines added to the <style> section at the top of the page, you can make all your tables look better.
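
If you can't "view source", here is a sketch of the kind of lines that do this. They are not copied from the original page, and the color and padding values are assumptions. They go in the <head> of the page:

```html
<style>
/* Merge the doubled borders of neighboring cells into single lines */
table { border-collapse: collapse; }
/* Draw a thin border around every heading and data cell */
th, td { border: 1px solid gray; }
/* Leave a little space between the border and the text */
th, td { padding: 4px 8px; }
</style>
```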

9. Viewing Your Web Page

Anyway, back to basics. If you make a simple index.html in your Web directory like:
<!DOCTYPE HTML>
<html lang="en">
<head>
<title>My first web page</title>
<META charset="utf-8">
<META name="viewport"
 content="width=device-width, initial-scale=1.0">
</head>
<body>
<h2>This is a heading</h2>
And this is some text.
</body>
</html>
Then if you double-click on index.html, it will open in your Web browser.

Now you can work on your page's <body>: add more text, add some images, add some links, add subheadings, some lists, some tables, whatever you want. Each time you make a change, save the file and reload the page in your browser (usually done by clicking on the Reload symbol, or typing Ctrl-R).

By the way, a web page can have any name at all; it doesn't have to be index.html. Index.html is a special name that is used for the "home page" of a website. To open a web page that has some other name, right-click on the filename and then choose "Open with..."; then click on your Web browser's name.

10. Installing Your Web Page on the Internet

How to put your web page on the Internet depends on your Internet Service Provider (ISP). At Columbia University, each user has a "shell account" on the central server, which runs a Unix-based operating system, and which you can access with a terminal emulator such as Kermit. Here's an example that applies to Columbia University's web server, showing how to upload your files from Windows:

There are easier ways to do this than what I describe below, but they require add-on software. The following method should work for everybody who has Windows and an Internet connection.

If you create a public_html subdirectory of your login directory, give it "world" read and search permission, and then create an index.html file in that directory and give it world read permission, you'll have a home page. In this example "$" is the shell prompt (yours might be different), and what you type is underlined. CAUTION: the directory name is public_html but the underscore might be obscured by the underline in the examples below. Whenever typing "public_html" always include the underscore. CAUTION #2: Some Web hosting sites might use a different name for the user's Web directory.

$ cd                      (Change to your login directory)
$ mkdir public_html       (Create public_html subdirectory)
$ chmod 755 public_html   (Give it world read/search permission)
$ cd public_html          (Enter the public_html subdirectory)

You only have to do this part once. Remember, it's public_html with an underscore, which tends not to show up when a command is underlined.

Let's assume you have created a website in the Web folder on your PC. Here's an example of how to upload your Web files to your public_html directory on Columbia University's Cunix server using FTP (File Transfer Protocol). First start the FTP program:

Start -> Run

and type "ftp" in the box. An FTP window opens and an "ftp>" prompt appears. Type the underlined commands at the "ftp>" prompt (substituting your own user ID, etc):

ftp> lcd Desktop
Local directory now C:\Users\olga\Desktop.
ftp> lcd Web
Local directory now C:\Users\olga\Desktop\Web.
ftp> open cunix
Connected to cunix.cc.columbia.edu.
220 Cunix FTP server (Version 5.60) ready.
User (cunix.cc.columbia.edu:(none)): olga
331 Password required for olga.
Password: (type your password here)
230 User olga logged in.
ftp> cd public_html ("public_html" with underscore)
ftp> binary
ftp> put index.html
200 PORT command successful.
150 Opening BINARY mode data connection for index.html.
226 Transfer complete.
ftp: 285 bytes sent in 0.00Seconds 285000.00Kbytes/sec.
ftp> site chmod 644 index.html
200 CHMOD command successful
ftp> bye

This sends the index.html file to your public_html directory on the server. You can send any other file by substituting its name for "index.html". If you want to send all the files in your Web folder, replace "put index.html" with "mput *" (asterisk, meaning "all files" in this directory); "put" sends only one file at a time, and "mput" asks for confirmation before each file unless you first type "prompt" to turn the questions off. Always use binary mode unless you know what you're doing.

If the "site chmod" command failed (this service is not supported by some FTP servers), you have one more step. Before others can see your web files, you have to give them "world read" permission. Again, log in to the server using a terminal emulator (Telnet, SSH, Kermit, whatever), and:

$ cd ~/public_html        (Enter the public_html subdirectory)
$ chmod 644 *             (Make all files publicly readable)

Now you have a home page. If you were at Columbia and your login ID was "olga", the address (URL) of your home page would be:

http://www.columbia.edu/~olga/

If you want to add pictures to your Web page, you can upload those too (also with Kermit or FTP), and you also have to "chmod 644" all the files to make them readable by everybody. Every time you add new files to your public_html directory, you have to "chmod 644" them so they are accessible, either in the FTP session itself (as shown previously), or by logging in to the host and:

$ cd ~/public_html ("public_html" with underscore)
$ chmod 644 *

Pictures should be in JPG or PNG or GIF format. To include a picture ("image") in your page, include a sequence like this at the desired spot:

<img src="filename" alt="brief description">

Replace filename by the name of the file (e.g. skyline.jpg). Almost every HTML tag can be customized by "attributes" in the begin tag. For example if you want the image to scale itself to the viewer's window (on a computer, cell phone, or other device), and furthermore you want the text of the page to flow around it, you can do:

<img alt="brief description" style="width:50%; max-width:480px; float:left; margin:10px;" src="filename">

You can look up the attributes in Google, just search for html width, html float, etc.

Now you have your own home page on the Web, and your own URL (Uniform Resource Locator, or Web address). In this example, the URL is:

http://www.columbia.edu/~olga/

Of course, if you prefer, you can also do all the Web-page editing directly on the server, using a server-based text editor like EMACS or Vi while logged in to the Unix shell. In that case you don't need to upload anything (except maybe photos), but then you also need to be more familiar with the server's Unix environment and commands and utilities.

11. Where to go from here

Most Web pages are created by hideous bloated "Web Authoring" tools, which are usually designed to hook you (and readers of your web pages) into some corporate profit-making scheme. If all you want is text with some pictures and links, some section headings, and maybe some tables, as opposed to spinning blinking popup holograms with streaming video, sound effects, etc, it's best to keep it simple and do it yourself. This is how the Web started off in the HTML 1.0 days of the early 1990s. The ingenious thing was that it was self-propagating. If you saw a web page with a certain effect and wanted to know how it was done, you could simply "view source" to get the "source code" and then adapt it to your own page. You can still do that with pages that look like this one, but since most Web pages are no longer made by hand, you'll often see tons of incomprehensible gibberish (the more special effects, the more gibberish), for example at CNN.

Anyway, if you have mastered the simple techniques shown in this page, you know the basics. Which is more than can be said of many "web designers" who only know how to use prepackaged software to create web pages by picking things from menus and moving things around with a mouse. To go further, you can almost always find out how to do what you want by searching Google ("html how do I ...?"), or looking at the HTML code of different websites (browser "view source" command) but, again, only for pages that look like this one.

Of course HTML is a standard, and here are the official references:

You can check the validity of your web page at the W3C Markup Validation Service. Note that this page itself does not pass the Validator because it uses a number of "obsolete" elements. That's because (a) they are much easier to explain, and (b) they still work. For the first 20 years or so HTML was in constant flux, but with the release of HTML5 in 2014, it seems to be pretty stable.

If you have made mistakes it will let you know, and if you have used "old" or "deprecated" HTML features it will let you know that too, and usually also suggest a modern replacement.

12. Postscript: Cell Phones

The original Web was composed of pages designed to fit on desktop computer screens, which, over time, became wider and wider. But then suddenly they also had to fit on minuscule cell-phone and even "smart watch" screens. The main pitfall is that an image might be too wide for the screen, so the image width should be specified as a percentage of the viewport width, e.g.:
<img alt="Brief description" 
 title="Slightly longer description"
 src="picture-of-something.jpg"
 style="width:100%;">
Text, on the other hand, usually just flows to accommodate the viewport. But if your page includes text that must not be "wrapped" (for example, program source code, poetry, computer dialog transcripts), you have to enclose such sections within:

<div style="overflow-x:auto; white-space:pre">
material that must not wrap
</div>

as has been done in several places above, in which case a horizontal scroll bar will appear automatically if the non-wrapping text is wider than the viewport. If you are viewing this page on a wide screen, you can see this effect if you squeeze your browser horizontally to its minimum width and then scroll through this page.

(End)

Frank da Cruz
Page created: 1992
Last update: 17 September 2021