Pronouns in technical writing.

When doing technical writing, or for that matter most forms of writing , we need to be able to refer to people without always identifying their gender.  The most common reason, at least in my writing, is that I am often speaking of some indeterminate person whose gender simply does not matter.  In my technical writing, I often speak of some generic DBA and what the best approach they could take to solving a problem is.  In my legal writing, I often speak of some generic person whose gender is irrelevant.  Often, the best approach is to recast the sentence in the plural, but when that is awkward or undesirably, I prefer the singular “they”.

While still somewhat controversial, and possibly not an ideal solution, the singular “they” seems the best answer to English’s lack of a good singular pronoun for people that is not gender specific.  It has been in use for quite a while, and its usage is becoming more accepted.  The Washington post recently allowed the singular they, although in somewhat limited circumstances, in a memo.  Merriam-Webster.com hosts a video discussing using “their” as a singular possessive pronoun.

There are several other options that could be used instead of the singular “they”.  “He or she” is commonly used in many places.  In one article I recently published, the article editor and I discussed whether to use “he or she” or “they” before we finally settled on “they.”  While somewhat a matter of taste, I find “he or she” to be unnecessarily long and wordy.  It becomes tiresome, especially when used several times in an article.  I think things like “s/he” and “he/she” are worse.  They are awkward when reading and not directly pronounceable.

An author could simply use one of “he” or “she” as a pronoun referring to a generic person without any implications for their gender.  There is in fact some indications, according to Wikipedia, that it is a long tradition for “he” to be used generically without specific reference to males.  But that has its own complications.  Aside from not being inclusive on its face, it could lead the reader to think the author is specifically referring to one gender or the other.

Perhaps the best solution would be to introduce a new word in English.  Attempts have been made to do this with proposals for “zhe”, “thon”, and “co”, but none of those have been widely adopted or accepted.  Unless English standardizes on a new addition, I think the singular “they” is the best choice for a gender neutral pronoun when the sentence cannot be gracefully rewritten to avoid the pronoun entirely.

New Publications and a Video

The people at Webucator were kind enough to make a video based on my article on dealing with SQL Server Lock Resource Errors. While I am certainly biased, I think the video came out quite well.  They also have a series of courses on SQL Server available here and their YouTube page with other videos is here.

Additionally, MSSQLTips has published another piece of mine dealing with Programmatically Changing SQL Server Linked Tables in Microsoft Access.

I also have a piece of microfiction up at SpeckLit.  It is titled Food Chain.  They have another piece of mine forthcoming that will be available here when it is published.

[Edit 25 Nov. 2015: Fixed a broken link, and put a title in proper title case]

Making Changes to Many Forms In Access

Although generally I prefer to create custom interfaces for my databases using fully developed programming languages like Python or C#. I prefer to have full access to their computational capabilities and in particular their graphing even when the initial plan does not need these. Projects have a tendency of expanding over time. But when I need to quickly create a simple front end for a database I sometimes turn to MS Access.

One issue I have faced occasionally is being asked to make the same change to all of the forms or all of the reports. Fortunately, the VBA built into Access makes this relatively simple for certain types of changes. You can iterate through all forms, make the desired change, and then save that change so it is permanent.

This is best described with an example, though I have only tested this with Access 2010 and 2013. For this example, I am just going to create a database with one table and two forms. I will put the VBA code itself into a module. The code will look like:

Public Function ChangeBkColor()
    Dim obj As AccessObject 'Entities in Allforms are AccessObjects
    Dim CurForm As Form
    
    'We want everything to be closed to avoid any problems
    'with reopening later
    For Each CurForm In Application.Forms
        DoCmd.Close acForm, CurForm.Name
    Next CurForm
    
    For Each obj In Application.CurrentProject.AllForms
        'If not opened as design, it may not save the change
        DoCmd.OpenForm obj.Name, acDesign
        'There should only be one open form now
        Set CurForm = Application.Forms(0)
        'Change the color to red, but could change others
        CurForm.Section(acDetail).BackColor = RGB(100, 0, 0)
        'Setting acSaveYes here is significant
        DoCmd.Close acForm, CurForm.Name, acSaveYes
    Next obj

End Function

I normally execute by highlighting the code in the VBA editor and then pressing F5, but it could also be tied to a macro if that is more convenient. Executing it then opens each form, changes the property in question, and then closes it again, saving the change. Using techniques like this has saved me a fair bit of time in trying to standardize certain things across numerous forms in a database.

SampleResults

SampleChange

Keep it short

I have been thinking about resumes a great deal lately. Since I recently passed the Bar Exam, I have been revising and sending out my resume in search of a legal job. At my current job, I have also fairly recently reviewed several resumes from new applicants. I cannot claim to be a resume expert, and I will not try to give any detailed advice. But there is one thing that many people that have given me advice have all agreed on, and that I wish the people sending in resumes that I have to review had remembered.

Keep it short.

When I looked into resumes before while looking for technical jobs and more recently when I was looking into them for legal jobs, virtually everyone recommended keeping them short. When I review resumes, I distinctly prefer them short. Exactly what short means varies somewhat by who is giving the advice, but some of them recommend strictly keeping it down to one page and a few others say that two is acceptable. When I have reviewed resumes at different times, a second page has never bothered me. But I have been rather annoyed when I have reviewed resumes as long as sixteen pages.

Generally, keeping it short will help ensure that the important information is easy to find and review. The unimportant information does not need to be there at all. There are occasions when a potential employer wants something longer and more detailed. However, those employers will generally make that clear by asking for a C.V. or even a portfolio rather than a resume. With a bit of help from a skilled counselor at my law school, I ruthlessly trimmed my resume down to one page, and I think it is stronger now that it is shorter.

Dealing with SQL Server LOCK resource errors.

I was recently doing some testing that required a fairly large table. I created the test table, and set it up to insert about 30 million rows. I got an error that read:

The instance of the SQL Server Database Engine cannot obtain a LOCK resource at this time. Rerun your statement when there are fewer active users. Ask the database administrator to check the lock and memory configuration for this instance, or to check for long-running transactions.

Naturally, having not encountered that before, I started by Googling the error and looking up Error 1204 and Microsoft’s Support Article. The support article helpfully suggested setting locks to 0 using sp_configure. But that is the default value and mine was already set that way.

So, I created a simplified setup that would reliably produce the error based on Jeff Moden’s Tally Table Article.

if OBJECT_ID('BigTallyTable', 'U') is not NULL

drop table dbo.BigTallyTable

create table BigTallyTable
(n BigInt primary key)


insert into dbo.BigTallyTable(n)
select top (30000000)
row_number() over (order by s1.name) as n
from master.dbo.syscolumns s1,
master.dbo.syscolumns s2

And I started playing with it. Since this all tied to locks, one thing I did was break it up into batches:

declare @maxValue int, @maxperround int, @last int
set @maxValue = 30000000 --30 Million
set @maxPerRound = 500000


set @last = 0
while @last < @maxValue
begin

if @maxPerRound > @maxValue - @last
set @maxPerRound = @maxValue - @last

insert into dbo.BigTallyTable(n)
select top (@maxPerRound)
row_number() over (order by s1.name) + @last as n
from master.dbo.syscolumns s1,
master.dbo.syscolumns s2

Print 'Finished a round with @last = ' + cast(@last as varchar) + ' and @maxPerround = ' + cast(@maxPerRound as varchar)

set @last = @last + @maxPerRound
end

This avoided the error and generated the results I was looking for. On a multi-user system (I was doing this on my test instance of SQL which is on my personal machine) it would have also allowed other queries that were waiting for locks to be able to execute more freely, though whether that is necessarily good or not depends on the circumstances.

I also got it to lock by directing SQL Server to take a table lock right from the beginning.

insert into dbo.BigTallyTable with (TABLOCK) (n)

select top (30000000)
row_number() over (order by s1.name) as n
from master.dbo.syscolumns s1,
master.dbo.syscolumns s2

This ran substantially faster than splitting it into batches did and is clearly a simpler query. It does take a more aggressive table lock which, as the article on lock escalation points out, can cause more problems with blocking other queries if that is a concern.

Order of a Result Set in SQL

I have seen people build procedures that rely on a result set being returned in a certain order.  This is fine if an order by clause is used, but SQL Server will not guarantee the order of the results without one.

It is interesting to look at the way SQL Server normally returns a result set though.  So, lets make a small sample table:

create table orderTest (n varchar(15))

insert into orderTest (n)
values  (2),
(1),
(3),
(1)

If we run a simple “select n from orderTest” the results I get back are 2, 1, 3, 1.  In other words, it returns the results in the order they were inserted.  This is relatively consistent.  I have tried similar tests on multiple versions of sql on multiple machines.  As I said at the beginning, I have seen some programs rely on the data being returned in the order it was inserted and those programs have gone for extended periods of time without any problems.  But I want to emphasize that SQL Server will not guarantee the order without an order by clause.  If you want to be able to guarantee that you can get the data back in the order that it was inserted in then you can add an identity column and order by that.

The tendency to return things in the order they were inserted seems fairly stable with simple where clauses, but asking SQL Server to return distinct results will change the order as it checks for duplicates.  For instance “select distinct n from orderTest” gives me 1, 2, 3.

And SQL Server seems to return the results of a union all by just concatenating the results of the two queries.  So

select n
from orderTest
UNION ALL
select n
from orderTest

Gives 2, 1, 3, 1, 2, 1, 3, 1.  Essentially, the contents of the table repeated twice.  But union makes the results distinct and so it changes the order.  Redoing that with union  instead of union all gives 1, 2, 3.

Indexes will also affect the order.  You may notice the tests were all done on a heap with no indexes.  But if I add an index (either clustered or nonclustered) it makes the order for “select n from orderTest” become 1, 1, 2, 3.  Of course, you cannot always rely on the data being returned in index order either, unless there is an order by clause.  Similarly, as Conor Cunningham pointed out, the size of the dataset can affect it and break apparent order as the optimizer seeks to increase parallelization.

In short, SQL Server has certain ways it will normally return data and it can be helpful to be aware of those.  Even small changes to the query or changes in the size of the data can have a major impact on the order the results are returned.  Without an order by clause, SQL is not guaranteed to return the data in the way it normally does and you cannot rely on that order.  If the order actually matters, use an order by clause.

SQL Server Three Valued Logic and Ansi Nulls

SQL Server uses a three valued logic with True, False, and Unknown. And, normally, SQL Server Nulls are not comparable. Together that means that this little script:

if Null = Null -- Results in Unknown
print 'Num 1'

if Null != Null -- Results in Unknown
print 'Num 2'

if not (Null = Null) or not (Null != Null) -- Results in Unknown
print 'Num 3'

normally produces no results. It simply prints the standard ‘Command(s) completed successfully.’ One significant implication of this is that if the where clause includes something like

[column_name] = null

It will filter out all rows. The best way to look for Nulls is to use

[column_name] is null

But, whether or not Nulls are comparable can be controlled by the use of the Ansi_Nulls session setting. The default for Ansi_Nulls is on, and in fact the documentation warns that ANSI_NULLS may always be ON, with no way to turn it off, in future versions. Until then, this test script:

Set Ansi_nulls off

if Null = Null -- Now Results in True
print 'Num 1'

if Null != Null -- Now False
print 'Num 2'

if not (Null = Null) or not (Null != Null) -- Now True
print 'Num 3'

Dutifully prints out:

Num 1
Num 3

It is generally best to leave Ansi_nulls in its standards compliant default setting and use “is null” or “is not null”, but occasionally it can be useful to change that.

[20 Aug 13 – Updated to add tags and change category]