Surveys covering specific groups of individuals often confront two types of obstacle: those inherent to the population concerned—limited number of individuals, problems reaching them, etc.—which make it difficult to constitute a representative sample, and those linked to the intimate or confidential nature of the subject covered. The survey of HIV-positive persons presented here faces both types of difficulty which, from a practical viewpoint, are often interlinked. However, the questions raised are handled with particular care by the authors to guarantee the scientific quality of the results obtained. Surveys of this kind are a vital tool for understanding the effects of HIV infection and treatment on the lives of HIV-positive persons, now that therapeutic advances have greatly increased their life expectancy.


How can a survey be conducted without a sampling frame? For demographers and epidemiologists alike, some subpopulations, defined by a health problem or a specific situation, are difficult to study, because there is no sampling frame for conducting a survey. In some cases, at the cost of considerable effort, the census can serve as a starting point. For example, 360,000 persons were interviewed for the “Vie quotidienne et santé” (Daily Life and Health, VQS) survey conducted alongside the 1999 census. On the basis of this very large sample, it was possible to select the persons eligible to participate in the “Handicaps, incapacités, dépendance” (Disability, Impairments and Dependence, HID) survey and 16,900 respondents were interviewed (Mormiche, 2000). In other cases, administrative files may provide access to the targeted population: for example, the survey launched in late 2004 of cancer patients still alive two years after diagnosis of the disease was based on the health insurance files of patients with long-term illnesses.


For other subpopulations, the most reasonable method is to contact respondents via facilities that provide them with specific services. For example, the various surveys conducted among homeless persons in France used shelters and distribution sites for food or hot meals as sampling frames, along with day centres or other mobile services (Firdion et al., 1998; Firdion et al., 2001; Brousse et al., 2002). It is even possible, if different types of service or site are used, to attempt to enumerate the subpopulation in question by identifying duplications in the sample (capture-recapture method): this technique was used to estimate the number of drug users at local level (Chevallier, 2001) [1]. For this type of survey, special attention must be paid to the definition of the target population, to the hidden population (persons who do not visit the sites used as the sampling frame [2]), to the unequal individual probabilities of being included in the sample (due to varying frequency of visits to the sampling sites), but also to the numerous potential sources of data collection bias, for example because the interview does not take place in the respondent’s home [3], or because an intermediary is required to establish contact.
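As an illustration of the capture-recapture principle, the sketch below uses the classic two-source Lincoln-Petersen estimator and its Chapman correction. This is an assumption made for illustration: the text does not say which estimator was used in the study cited, and the counts are hypothetical.

```python
def lincoln_petersen(n_first: int, n_second: int, n_both: int) -> float:
    """Estimate population size from two overlapping lists.

    n_first  -- persons identified at the first type of site
    n_second -- persons identified at the second type of site
    n_both   -- persons found on both lists (the duplicates)
    """
    if n_both <= 0:
        raise ValueError("at least one duplicate is needed to estimate the size")
    return n_first * n_second / n_both


def chapman(n_first: int, n_second: int, n_both: int) -> float:
    """Chapman's variant, less biased when the number of duplicates is small."""
    return (n_first + 1) * (n_second + 1) / (n_both + 1) - 1


# Hypothetical counts: 200 persons seen at site A, 150 at site B, 30 at both.
estimate_lp = lincoln_petersen(200, 150, 30)   # 200 * 150 / 30
estimate_ch = chapman(200, 150, 30)
print(f"Lincoln-Petersen: {estimate_lp:.0f}, Chapman: {estimate_ch:.0f}")
```

The estimator assumes that the two lists are independent and that every member of the subpopulation has some chance of appearing on each. Both assumptions are fragile for hidden populations, which is why the caveats in the paragraph above matter.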

The HIV-positive population: a subpopulation that is difficult to reach


Introduced in 1996, combination antiretroviral therapy (ART) has transformed the lives of persons infected with HIV, at least in developed countries where these treatments are available and widely accessible. This treatment does not provide a cure, but it keeps the disease at bay and considerably increases the life expectancy of HIV-positive persons. In France, the number of deaths due to AIDS has considerably decreased in recent years (Nizard, 2000; Delfraissy, 2004). HIV infection has become a chronic disease, around which infected persons must learn to (re)build their lives.


To ensure adequate and effective medical care for patients, it has become crucial to study the living conditions of HIV-positive persons: the physical and psychological consequences of the infection on their daily lives, their social and labour force integration, their emotional and sex lives and their plans for the future. Existing surveys, whether cohort follow-up studies or cross-sectional surveys, do not explore all the different implications of the “chronicization” of AIDS, either because they are too old (Schwoebel, 1997), or because they are limited to specific groups such as persons infected through drug use (Moatti et al., 2000), or again because they are centred on bioclinical aspects (Meyer et al., 2003) or on a particular theme (ART monitoring, see Spire et al., 2002; “needs” surveys of the hospital administration, INVS, 2001). Moreover, in France, the dynamics of the HIV-positive population over time have varied according to the origin of infection (male homosexuals and bisexuals, intravenous drug users, migrants from sub-Saharan Africa, general heterosexual population), and from one region to another.


To provide a complete and up-to-date picture of the situation and of the needs of HIV-positive persons, the Agence nationale de recherches sur le sida (French AIDS research agency, ANRS) financed a large national survey of HIV-infected persons called VESPA. More precisely, the aim was to describe the living conditions and the social situation of HIV-positive persons, in a context where treatments that postpone clinical symptoms and increase life expectancy are available and accessible. The survey targeted such aspects as the accessibility of health care and insurance, the impact of the infection and of treatment, labour force integration, financial resources and living conditions, social life, conjugal, emotional and sex life, parenthood, and discrimination occurring in any of these realms. This focus meant that the respondents needed to be at least 18 years old, to have been aware of their HIV status for some time and to have lived in France long enough to be covered by the French social protection system.


HIV status being a sensitive issue and protected by medical secrecy, there are no official files that can serve as a sampling frame. Various methods were considered, and the method finally chosen was to interview hospital outpatients consulting for treatment of an HIV infection. The July 2004 special issue of AIDS Care, devoted to the 6th international AIDS Impact conference, shows the diversity and the limits of the techniques generally used in France and abroad to interview HIV-positive persons (AIDS Care, 2004). In the surveys presented at the conference, the respondents were of three types: volunteers recruited through associations, persons contacted using procedures to target specific populations known to have a high prevalence of HIV (drug addicts, prostitutes, etc.), or health care users. In the last case, patients are usually recruited within a single hospital or a limited number of hospitals (multicentric studies). Moreover, these patients often either volunteer or are chosen by the physicians, so it is not always possible to calculate a response rate, let alone to characterize the non-respondents. The VESPA project aimed to overcome these constraints.


In order to explore all aspects of the “chronicization” of the disease, we decided to combine a face-to-face interview lasting about 40 minutes and conducted by a professional interviewer [4] with a second questionnaire filled in by the respondent him/herself, and mainly comprising psychometric scales. This second questionnaire was designed to be completed in about 20 minutes, not counting the medical questionnaire filled in by the physician.

Potential biases in a hospital survey of HIV-infected persons


Surveys conducted among inpatients or outpatients, including HIV patients, are increasingly common on all continents (Miller et al. 2003; Burgoyne and Renwick, 2004). Recruiting respondents in a hospital of course makes it easier to find the target population. However, it does pose some methodological problems, of the kind already mentioned above. First, there is a coverage problem: limiting the survey to HIV-positive hospital outpatients leaves out all those who, though they are aware of their HIV status, refuse any form of health care, those who are treated elsewhere, and those who are hospitalized.


In France, 93% of patients treated for HIV are cared for in hospitals (Bourdillon et al., 1996; Nadal et al., 1997) [5]. This overwhelming preference can probably be explained by the rapid changes in therapeutic strategies and by the fact that some antiretroviral drugs can only be prescribed and dispensed in hospitals. In addition, since the introduction of combination therapies and the resulting decline in the frequency of clinical symptoms, hospital care has changed considerably, with a sharp decrease in complete, inpatient hospitalizations [6] and a rise in outpatient consultations (ministère de l’Emploi et de la Solidarité, 2000). The findings of the HIV-Plan survey (Obadia et al., 2002) also show that the sociodemographic and epidemiological profile of patients treated in day hospitals is very similar to that of persons attending outpatient consultations. In addition, patients hospitalized overnight are usually also treated on an outpatient basis, whereas those who do not consult at all are usually unable to answer a questionnaire, due to their poor state of health or to cognitive disorders. The HIV-Plan survey shows that half of this population was diagnosed only because of an opportunistic infection leading to hospitalization: in this respect they did not “experience” the effects of being HIV-positive and are thus outside the scope of the VESPA survey. For these reasons, we decided to survey only persons using outpatient services.


Though the coverage problem is minor, there are other difficulties which cannot be ignored. First, the diagnosis of HIV infection is covered by medical secrecy and all survey teams must first pass through the physicians, who are the sole holders of this information, before approaching the patients, informing them and obtaining their written consent as required by the rules of medical research (CCNE, 1998). This confidentiality rule is even more stringent for HIV patients due to fears of stigmatization and discrimination. Under this rule, physicians must be the first to ask patients whether they wish to participate in the study. They may also choose not to ask a patient if they do not believe that he or she is capable of responding [7].


Furthermore, patients who consult at the hospital tend to have less time than persons who are contacted at home by telephone for a face-to-face or phone interview, often after having received a letter describing the survey, and with the possibility of making an appointment at their convenience. The former are less likely to be willing to spend time on a survey than the latter, especially if they have work or family obligations, or must make a long trip home. A hospital survey thus introduces a specific selection bias since the patients can only be reached through the physicians and because some patients cannot spare the time at that moment [8].


Of course, if the questionnaire is very long, as was the case for the VESPA survey, the biases linked to patient availability are greater. This is probably a common difficulty for biomedical studies that adopt a sociobehavioural approach, since surveys of this kind, generally based on rather long questionnaires, must also include standardized scales aiming to measure the various aspects of respondents’ state of health. In addition, in VESPA, the very fact of administering a questionnaire to be filled in by the respondents themselves introduced an additional bias linked to their reading and writing skills. In short, the use of a self-administered questionnaire to complete the patients’ history is liable to reinforce the availability bias while introducing a new one.


The VESPA 2003 survey provided an opportunity to measure these different biases with a view to testing the pertinence and limits of the collected data and to contribute to the study of methodological problems raised by hospital surveys. This article examines the representativeness of the survey and studies the impact of these biases. The first part briefly describes the construction of the sampling frame and of the sample, along with the organization of survey implementation, and presents the means at our disposal to study selection bias. The second part compares patients present at the time of survey according to their degree of success in passing through the filters that led to inclusion in the sample (random selection, physician’s invitation, understanding of French, agreement to take part). The third part focuses on the participants’ characteristics, depending on whether or not they agreed to fill in the self-administered questionnaire at the end of the interview.

I - Methodology and data collection

1 - Sample construction

Eligibility criteria


Since the aim of the survey was to study the experience of persons infected with HIV, we decided not to include persons who had known their HIV status for less than six months. In addition, since the occupational and social life of the patients and their use of different social services were among the main focuses of VESPA 2003, the survey did not cover persons under 18, non-resident foreigners or foreigners who had moved to France less than six months previously (for these persons, the questionnaire, based on the French social protection system, would not have been relevant). Last, to ensure that the main face-to-face interview could be conducted without difficulty, the patients were required to understand French, though it was important to avoid excluding persons for this reason if possible because foreign residents are increasingly numerous among persons receiving HIV treatment in France (Cazein et al., 2000) and as such they had to be adequately represented in the sample. To encourage their participation, the questions and possible responses were formulated to avoid comprehension problems for people whose knowledge of French is limited.

Constructing the sampling frame


To build the sampling frame, we first listed the hospitals which treated HIV-infected patients then estimated, for each of them, the number of HIV patients who had come in over the year (annual patient throughput). Two sources of information were available: first, the “compulsory declarations” (DO – Déclaration obligatoire) of AIDS cases, indicating for each hospital the number of persons having reached the stage of clinical AIDS during the year (CDC, 1992) and representing only a fraction of the HIV-infected persons cared for at the hospital at a given time (De Peretti et al., 2001); second, the medical, epidemiological and economic file on human immunodeficiency (DMI2), which provides information on patient throughputs but only for hospitals that are linked to a Centre d’information et de soins de l’immunodéficience humaine (CISIH, centre for information and care of human immunodeficiency). By crossing these two sources, and by adding the half-yearly survey of the Direction des Hôpitaux (central hospital administration) (June 2000 [9]), a list of 608 establishments caring for HIV-infected persons was drawn up. The DOs were available for each of these establishments and for 69 of them the DMI2 provided information on patient throughput. Assuming that AIDS cases represent a stable fraction of these patients, we could calculate the ratio DMI2/DO when both figures were available, then multiply by this ratio the DOs of each hospital for which the patient throughput was unknown to estimate the missing patient throughput. Overall, the national patient throughput was evaluated at 58,240 patients. The exhaustiveness of the DOs having been evaluated at 83.6% (Bernillon et al., 1997), this leads to a final estimate of about 68,000 patients.
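The ratio step described above can be sketched as follows. This is only an illustration under stated assumptions: the ratio is computed here as a single pooled DMI2/DO ratio over the hospitals where both figures exist (the text does not specify whether a pooled ratio or an average of per-hospital ratios was used), and the hospital names and counts are hypothetical.

```python
def impute_throughput(do_counts: dict[str, int],
                      dmi2_throughput: dict[str, int]) -> dict[str, float]:
    """Estimate annual patient throughput for every hospital.

    do_counts        -- AIDS case declarations (DO) per hospital, known for all
    dmi2_throughput  -- observed throughput (DMI2), known for some hospitals only
    """
    both = [h for h in do_counts if h in dmi2_throughput]
    if not both:
        raise ValueError("no hospital has both a DO count and a DMI2 throughput")
    # Pooled ratio: total observed throughput per declared AIDS case.
    ratio = sum(dmi2_throughput[h] for h in both) / sum(do_counts[h] for h in both)
    return {
        h: float(dmi2_throughput[h]) if h in dmi2_throughput else do_counts[h] * ratio
        for h in do_counts
    }


# Hypothetical data: hospital C has no DMI2 figure, so its throughput is imputed.
do_counts = {"A": 10, "B": 20, "C": 5}
dmi2_throughput = {"A": 120, "B": 180}
estimated = impute_throughput(do_counts, dmi2_throughput)
print(estimated, sum(estimated.values()))
```

The approach stands or falls on the stated hypothesis that AIDS cases represent a stable fraction of the patients treated in each hospital.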

Constructing the sample


The target sample size was set at about 3,000 patients. Judging from previous surveys, this size would ensure enough interviews to obtain information about the living conditions of sub-groups, even minority ones (women, foreigners), while limiting the collection period to about two weeks in each hospital in order to limit costs and avoid disrupting the hospital’s daily routine for too long (Obadia et al., 2002). To accurately represent the distribution of the disease over the French territory, a random selection of hospitals was made using a stratification based on two criteria: the region and the size of the estimated annual patient throughput (in three levels [60–150], [150–300], > 300 patients). The hospitals whose estimated annual throughput was under 60 patients were not included in the random selection. This reduced the number of hospitals to be surveyed to 143, representing 90% of the national patient throughput. Overall, 87 establishments (102 hospital departments) were randomly selected [10]. The number of patients to be surveyed in one establishment was determined on a pro rata basis according to its share in the total throughput of the region; this number was then distributed among departments and physicians proportionally to the number of half-days per week devoted to HIV consultations. All the physicians of a given department who treated HIV patients were asked to participate. This is because in the case of AIDS, a long-term illness affecting a very diverse population, some physicians may, through experience or preference, have a greater number of patients belonging to a particular group, while the patients themselves may have a preference for a particular physician, upon whom their future survival depends [11].
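The two-level pro rata allocation described above can be sketched as follows. The largest-remainder rounding rule is an assumption introduced here so that the allocated numbers always sum to the quota; the text does not say how fractions were rounded. All names and figures are hypothetical.

```python
def allocate(total: int, weights: dict[str, float]) -> dict[str, int]:
    """Split `total` proportionally to `weights`, rounding by largest remainder."""
    weight_sum = sum(weights.values())
    exact = {k: total * w / weight_sum for k, w in weights.items()}
    alloc = {k: int(v) for k, v in exact.items()}      # round down first
    leftover = total - sum(alloc.values())             # units still to hand out
    by_remainder = sorted(exact, key=lambda k: exact[k] - alloc[k], reverse=True)
    for k in by_remainder[:leftover]:
        alloc[k] += 1
    return alloc


# Step 1: a regional quota of 100 interviews, shared among establishments
# according to their estimated annual patient throughput.
per_hospital = allocate(100, {"H1": 400, "H2": 250, "H3": 150})

# Step 2: the quota of H1, shared among its physicians according to the
# number of half-days per week devoted to HIV consultations.
per_physician = allocate(per_hospital["H1"], {"Dr A": 4, "Dr B": 3, "Dr C": 1})

print(per_hospital)
print(per_physician)
```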

2 - Organization of data collection


The data collection phase was subject to various constraints: the survey had to be conducted over a short period, in around one hundred hospital departments of metropolitan France, with minimum disturbance to hospital routine and paying close attention to the problem of confidentiality and the need to avoid behaviour that might threaten secrecy, notably in “mixed” departments also receiving non-HIV patients. As the patients were interviewed on the same day as their hospital appointment, without previous warning, the interview had to be kept as short as possible and begin directly after the patient had left the consulting room.


To address these constraints, a system was set up to organize the roles of the different persons involved with the patients, whether or not they belonged to the department, and the various documents needed for the survey: within the department, the main survey intermediaries were the correspondent (usually a nurse [12]) and the consulting physicians; outside the department, the intermediaries were the Attachés de recherche clinique (ARC – clinical research attachés) of a company that specializes in the management of clinical research in hospitals, and the professional interviewers.


Depending on the schedule of appointments and the number of eligible patients, the ARC and the correspondent decided on which half-days the survey would take place. The correspondent would then complete an information register for each eligible patient with an anonymous identifier and a few basic facts: sex, age group, mode of HIV transmission, occupational status, viral load and CD4 count [13]. The patients were then chosen in the following way: at the beginning of each half-day of appointments, the physician would ask each eligible patient in turn until one agreed to be interviewed. The physician would then wait until the interviewer was free before asking the next patient. The sampling rate among eligible patients was thus determined by the interviewer’s availability, which itself depended on the duration, necessarily variable, of each interview. This procedure was implemented to avoid introducing selection bias by allowing the physicians to select the patients themselves.


The anonymous information register was then transmitted to the ARCs and the data entered by the survey agency. For each eligible patient, it was thus possible to know whether he/she had been selected, whether he/she had been asked to participate or whether the physician considered that her/his physical or mental state was too poor to answer a long questionnaire (“physician’s refusal”), then, if the patient had been asked to participate, whether he/she was excluded because of poor French language skills (“language problem”), whether he/she refused to participate (“patient’s refusal”), or if he/she accepted. This register thus provides important data for identifying possible bias in the survey.
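The successive filters recorded in the register can be represented as a simple decision sequence. The sketch below is a hypothetical data model: the field names and the `classify` helper are illustrative and are not taken from the survey documentation.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    NOT_SELECTED = "not selected"
    PHYSICIAN_REFUSAL = "physician's refusal"
    LANGUAGE_PROBLEM = "language problem"
    PATIENT_REFUSAL = "patient's refusal"
    PARTICIPANT = "participant"


@dataclass
class RegisterEntry:
    """One anonymous line of the register (hypothetical field names)."""
    selected: bool
    asked_by_physician: bool = False
    understands_french: bool = True
    agreed: bool = False


def classify(entry: RegisterEntry) -> Outcome:
    """Apply the filters in the order in which a patient meets them."""
    if not entry.selected:
        return Outcome.NOT_SELECTED
    if not entry.asked_by_physician:
        return Outcome.PHYSICIAN_REFUSAL
    if not entry.understands_french:
        return Outcome.LANGUAGE_PROBLEM
    if not entry.agreed:
        return Outcome.PATIENT_REFUSAL
    return Outcome.PARTICIPANT


register = [
    RegisterEntry(selected=False),
    RegisterEntry(selected=True, asked_by_physician=False),
    RegisterEntry(selected=True, asked_by_physician=True, understands_french=False),
    RegisterEntry(selected=True, asked_by_physician=True, agreed=False),
    RegisterEntry(selected=True, asked_by_physician=True, agreed=True),
]
counts = Counter(classify(entry) for entry in register)
print({outcome.value: n for outcome, n in counts.items()})
```

Because every eligible patient receives exactly one outcome, the profiles of the groups lost at each stage can be compared with those of the patients who remain.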


If the patient refused, the physician would ask him/her to fill in a short refusal form [14] which provided a second source of information on patients who refused to participate in the survey. Patients willing to participate were introduced by the physician to the correspondent who would take them to the interviewer to answer the CAPI questionnaire in a private (and discreet) room. After completing the questionnaire, the interviewer gave the patient a 15-euro voucher to thank them for their time [15], and asked them to answer an additional self-administered questionnaire, to be completed and returned to the correspondent.

II - Selecting respondents from among patients present

1 - Initial results


Overall, the survey lasted from December 2002 to September 2003. During that time, 7,904 eligible patients [16] were present during the half-days when the survey was conducted. Figure 1 illustrates the different filters leading to the selection of the 2,932 patients that were actually interviewed: 5,080 were first randomly selected among those eligible; of these, 264 were not considered capable of responding by the physicians, while 117 patients turned out not to speak French well enough to go through with the interview; 1,767 refused to participate. Overall, the participation rate can be estimated at 2,932/(5,080-117) or 59% [17]. The main interview with CAPI questionnaire lasted an average of 46 minutes. After this interview, 2,410 patients (or 82% of the participants) agreed to fill in the self-administered questionnaire.
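The arithmetic behind these figures can be checked in a few lines. The counts are those quoted in the paragraph above; only the variable names are introduced here.

```python
eligible_present = 7904     # eligible patients present on survey half-days
selected = 5080             # randomly selected among them
physician_refusals = 264    # not solicited by the physician
language_problems = 117     # insufficient French for the interview
patient_refusals = 1767     # solicited but declined
self_administered = 2410    # participants who also filled in the second questionnaire

# Participants are what remains once the three filters have been applied.
participants = selected - physician_refusals - language_problems - patient_refusals

# Patients with language problems are removed from the denominator.
participation_rate = participants / (selected - language_problems)
self_administered_rate = self_administered / participants

print(f"participants: {participants}")
print(f"participation rate: {participation_rate:.1%}")
print(f"self-administered questionnaire: {self_administered_rate:.1%}")
```

Note that the denominator of the participation rate keeps the 264 “physician’s refusals”, so the 59% figure reflects both the physicians’ and the patients’ decisions.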

Figure 1 – Different stages in sample selection for the ANRS VESPA 2003 survey

As mentioned in the introduction, it is important in this type of survey to watch for unequal probabilities of inclusion in the sample due to higher or lower frequency of visits to the sites where sampling took place. In the present case, the frequency of consultations turned out to be relatively homogeneous, since nearly half of the sample had consulted every three months over the last twelve months [18]. The sample was weighted by the reciprocal of the annual number of consultations, though this only marginally modifies the results. Given that the aim is to assess possible biases and not to describe the HIV-positive population, this weighting was not applied to the analyses which follow.
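The weighting principle can be sketched as follows: a patient who consults more often has more chances of being present on a survey half-day, so each respondent is weighted by the reciprocal of his or her annual number of consultations. The data below are hypothetical, and the helper is only an illustration of the principle, not the weighting scheme actually implemented for the survey.

```python
def weighted_share(visits_per_year: list[int], has_trait: list[bool]) -> float:
    """Share of respondents with a trait, weighting each by 1 / visits."""
    weights = [1 / v for v in visits_per_year]
    return sum(w for w, t in zip(weights, has_trait) if t) / sum(weights)


# Hypothetical respondents: annual number of consultations and a yes/no trait.
visits = [1, 4, 4, 12]
trait = [True, False, True, True]

unweighted = sum(trait) / len(trait)
weighted = weighted_share(visits, trait)
print(f"unweighted: {unweighted:.3f}, weighted: {weighted:.3f}")
```

When the frequency of consultations is nearly the same for everyone, as reported above, the weights are nearly equal and the weighted and unweighted results barely differ.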


Comparing the first and last columns of Table 1, we see that, in terms of the data collected in the information register, the participants’ profile was very similar to that of the present eligible patients. In absolute values, the mean difference between initial percentages (first column) and final percentages (last column) is 1.3 points. More precisely, there are overall slightly more men (the proportion rose from 70.4% of present eligible patients to 72.9% of participants), a few more jobless patients (from 37.8% to 40.5%), also a few more patients with a low viral load or a higher CD4 count, but above all more patients who were infected through homosexual transmission. These patients represent 36.5% of present eligible patients and 41.1% of the participants, and this 4.6 point difference was by far the highest observed. Of course, some of these differences are linked: sex and transmission group, employment and transmission group, viral load and CD4 count.

Table 1 – Sociodemographic and epidemiological characteristics of patients (distribution in %): from present eligible patients to participating patients

2 - The first potential bias: sample drawing


The first potential selection bias, which we will examine briefly here, concerns the patient selection method. If we compare the first two columns of Table 1 (eligible patients and randomly selected patients), the differences are very small: in absolute terms, the difference is below 0.4 points on average. The largest differences pertain to occupational status: there are 37.8% unemployed among eligible patients versus 38.9% among the randomly selected patients. However, this difference remains small, and even though a χ² test indicates that it is statistically significant (at the threshold p = 0.01), this degree of significance is almost inevitable due to the large numbers involved.
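The effect of sample size on significance can be illustrated with a rough calculation. The sketch below rests on several assumptions: occupational status is collapsed into two categories (unemployed versus other), counts are rebuilt from the rounded percentages and totals quoted in the text, and selected patients are compared with non-selected ones in a 2×2 table. The test actually performed by the authors may have been set up differently.

```python
import math


def chi2_2x2(a: float, b: float, c: float, d: float) -> float:
    """Pearson chi-square statistic for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def p_value_1df(chi2: float) -> float:
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))


eligible, selected = 7904, 5080
unemployed_eligible = round(eligible * 0.378)   # rebuilt from the rounded 37.8%
unemployed_selected = round(selected * 0.389)   # rebuilt from the rounded 38.9%

a = unemployed_selected                  # selected, unemployed
b = selected - a                         # selected, other
c = unemployed_eligible - a              # not selected, unemployed
d = (eligible - selected) - c            # not selected, other

chi2_full = chi2_2x2(a, b, c, d)
# Same proportions, but with every cell divided by ten.
chi2_tenth = chi2_2x2(a / 10, b / 10, c / 10, d / 10)

print(f"full size:  chi2 = {chi2_full:.2f}, p = {p_value_1df(chi2_full):.4f}")
print(f"one tenth:  chi2 = {chi2_tenth:.2f}, p = {p_value_1df(chi2_tenth):.4f}")
```

For fixed proportions the statistic grows in proportion to the number of observations, so a gap of about one percentage point crosses the significance threshold only because several thousand patients are involved.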

3 - A second source of bias: the physician


Due to the very small number of “physician’s refusals”, the second and third columns of Table 1 are very similar. This does not mean that the patients who were not solicited by their physician do not have a specific profile. The first column of Table 2 shows which characteristics found in the information register distinguish these patients from those who were randomly selected. Since the physicians excluded patients whose physical and mental state was not good enough, it is not surprising that the “physician’s refusals” are significantly linked to variables that we know to be strongly correlated with the physical and mental health of patients.

Table 2 – Factors linked respectively to physicians’ refusals, language problems and patients’ refusals (logistic regressions using a forward stepwise selection procedure)

First of all, the risk that a physician will not solicit a patient increases with the viral load and decreases when the CD4 count increases. Since these two medical indicators are strongly correlated with the physical and mental state of HIV-positive persons (Obadia et al., 2002), this result was expected: the physicians appear to have excluded patients in the poorest state of health and these patients above all. Then, compared with patients infected through homosexual transmission, the risk of a “physician’s refusal” is higher among those who were infected through intravenous drug use (IVDU) and to a lesser extent among those infected through heterosexual transmission. One may assume that even controlling for the CD4 count and the viral load, the two latter groups are generally in a poorer state of health, for various reasons: they are more often ill due to the hepatitis C virus (for the IVDU group) [19] or suffer from psychiatric or cognitive disorders; they are more often diagnosed late (for the heterosexual contamination group, which includes more foreigners); last, they are more frequently in a situation of social and economic insecurity, and hence more exposed to health risks. In addition, among this population, poor health often makes it impossible to work (Dray-Spira et al., 2003), which may explain why physicians tended to avoid soliciting unemployed patients.


However, we cannot exclude the possibility that physicians may have refrained from soliciting certain patients for other reasons. Indeed, if the physician feels ill-at-ease with a patient, if communication is problematic, or the relationship tense, it is probably more difficult to ask: and this doubtless happens more often with the IVDU group, reputedly more “difficult”, or with foreign patients who are mostly found in the heterosexual transmission group [20]. Moreover, the supposed link between state of health and employment does not explain the fact that the highest estimated odds ratio is linked to missing data: compared with a patient whom the medical staff knows to be in employment, a patient whose occupational status is unknown has a higher chance of not being recruited (OR = 2.78). Such a lack of knowledge on the part of the physician and the correspondent might be the sign of a difficult or still fragile therapeutic relationship.
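An odds ratio is not a ratio of probabilities, and its practical meaning depends on the baseline. The sketch below converts the odds ratio quoted above into a probability for an assumed reference probability. The 5% reference value is hypothetical: the text does not report the rate of “physician’s refusal” among patients known to be in employment.

```python
def apply_odds_ratio(p_reference: float, odds_ratio: float) -> float:
    """Probability in the comparison group, given the reference probability."""
    odds = p_reference / (1 - p_reference) * odds_ratio
    return odds / (1 + odds)


p_reference = 0.05   # hypothetical refusal probability for patients in employment
p_unknown_status = apply_odds_ratio(p_reference, 2.78)
print(f"{p_reference:.1%} -> {p_unknown_status:.1%}")
```

Under this assumption the probability rises from 5% to a little under 13%, that is, by a factor smaller than 2.78. The two ratios only coincide when the event is very rare.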

4 - A third source of bias: the language barrier


Like the “physician’s refusal”, the “language problem” remained a rather exceptional occurrence (n = 117) despite the high proportion of foreigners in the sample. Thus, the third and fourth columns of Table 1 are practically identical, the difference between percentages being always less than one point. If there is a bias here, it is very weak. The second column of Table 2 describes the characteristics of the patients who were excluded due to language problems. All other things being equal, this situation occurs much more frequently among patients aged under 30, among unemployed patients or those whose occupational status is unknown, among those who have a low CD4 count, but even more so among those infected through heterosexual transmission (with an odds ratio reaching 5.10), or belonging to the miscellaneous transmission group “other” [21].


The language problem is strongly correlated with nationality. The analysis of data collected from the 2,932 participants shows that the patients of foreign nationality share certain characteristics with patients excluded due to “language problems”. They are more numerous among persons aged under 30, among the unemployed or those whose occupational status is unknown and they are more often in the heterosexual transmission group (and to a lesser extent in the “others” group).


However, this analysis of foreigners who agreed to participate in the survey shows that the foreign HIV-positive population is more often female, and often has a better immune status (i.e. a higher CD4 count). But in Table 2, sex is not related to “language problems”, and though there is a link with the CD4 count, it is in the “wrong” direction. One might object here that the other variables introduced into the model capture the unobserved impact of nationality: this is quite probable, because if we look at the raw data, women are over-represented among the 117 persons with language problems (there are 56 women, almost half). On the other hand, further scrutiny confirms that this subpopulation does have a lower CD4 count. It is possible that there is a greater concentration of recently arrived foreigners in this category (even though one of the eligibility criteria was residency in France for at least six months), who are more likely to have problems understanding the language and to be in a poorer state of health [22].

5 - A fourth source of bias: the patient’s refusal


Refusal is by far the most frequent reason for non-participation (n = 1,767), and Table 1 shows that it induces stronger distortions in the sociodemographic and epidemiological structure of the sample than those observed previously. In particular, the proportions of unemployed persons and of the homosexual transmission group increased by 3.1 and 2.7 points respectively. A forward stepwise logistic regression isolates three significant factors in the “patients’ refusals”: the fact of having a job (versus not having one); the fact of belonging to a transmission group other than the homosexual group, and last, a low CD4 count.


The greater tendency to respond among homosexuals is probably due to the historical link between AIDS and the homosexual community, with a high degree of mobilization on the part of associations and a great deal of militancy (Pollack, 1988; Thiaudière, 2002). In addition, these patients very often come from more privileged social backgrounds so they are more willing and able to speak about their disease. Indeed, an examination of the interviewers’ remarks in the last part of the questionnaire concerning the interview and how it went shows that the participants who belonged to the homosexual transmission group were more often willing to participate, less often wanted to stop the interview before the end and were more frequently perceived as relaxed by the interviewers.


Physicians or correspondents also noted on the information register the reasons given by the patients for refusing to participate. Overall, out of the 1,767 patients who refused, 1,702 gave a reason. Three out of four patients said it was due to lack of time. When more details were provided, the first cause given was work-related (“I have to go to work”), followed by family reasons (“I have to pick up the children”) or medical ones (“I have other tests to do”). Other than lack of time, one in ten patients mentioned a poor state of physical or mental health [23]. These reasons correspond to the results of the logistic model: persons with a job usually gave work-related reasons for refusing to participate; to a lesser extent, a poor state of health, indicated in the model by a low CD4 count, also led to refusal.

6 - Gender-related reasons for refusal?


Of course, lack of time is one of the reasons most frequently given for refusing to participate in any survey, whether face-to-face or over the phone. This reason may in fact conceal a lack of interest or distrust towards the survey and opinion polls in general (which has little to do with the length of the questionnaire) rather than a real time factor. One would expect this lack of interest or distrust to be found more frequently among patients who also refused to fill in the refusal questionnaire, and less often among those who accepted. In fact, 1,143 of the 1,767 patients who refused to participate (about two-thirds) agreed to fill in the very brief refusal questionnaire (one page) given to them by the physician. These questionnaires provide some information on what distinguishes those who took part in the survey from those who accepted the one-page questionnaire but refused the 45-minute interview. It was thus possible to study the bias induced by the choice between a lengthy face-to-face interview and a much shorter questionnaire, either self-administered or filled in with the help of the physician.


Since people may have both work and family obligations, and since the sharing of these obligations is sexually differentiated within a couple, we studied men and women separately. We estimated models in which occupational status and the fact of having dependent children, as well as the interaction between these two constraining situations, might influence participation in the survey when participants are compared with non-participants who agreed to fill in the refusal form. However, since for reasons of confidentiality the refusal form was not paired with the information register, the data from the register were not available and we were obliged to make do with the scanty sociodemographic data provided in the refusal questionnaire (see Table 3).

Table 3 – Factors linked to participation in the survey, among participants and non-participants who agreed to fill in the refusal questionnaire (logistic regression using a forward stepwise selection procedure)

The factors significantly associated with participation are sexually differentiated, and they are less numerous for women, partly due to the fact that their number is smaller. For women, the likelihood of participating in the survey is reduced due to the combined factors of having a job and taking care of children, and not due to each factor separately, as is the case for men. For men, the likelihood of refusing to participate is higher among Frenchmen, among patients aged under 30 and among those who are less educated. Conversely, in the female sub-sample, it is the most highly educated women who are less likely to participate. It is possible that men who have a job but are not highly educated have more constraints in terms of work schedules, whereas for women on the other hand, greater work constraints are associated with a higher educational level.
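The difference between a combined effect and two separate effects can be made concrete with a minimal sketch of a logit model that includes an interaction term. The coefficients below are illustrative assumptions, not the Table 3 estimates, and the function name is hypothetical.

```python
import math

def participation_probability(job, child, b0, b_job, b_child, b_inter):
    """Probability of participating under a logit model with an interaction.

    job and child are 0/1 indicators; every coefficient is hypothetical.
    """
    logit = b0 + b_job * job + b_child * child + b_inter * job * child
    return 1.0 / (1.0 + math.exp(-logit))

# Illustrative coefficients (assumed, NOT estimated from the survey):
# no main effect of either constraint, but a negative effect when a job
# and dependent children are combined.
COEFS = dict(b0=1.0, b_job=0.0, b_child=0.0, b_inter=-0.8)

profiles = {
    (job, child): participation_probability(job, child, **COEFS)
    for job in (0, 1)
    for child in (0, 1)
}

for (job, child), p in sorted(profiles.items()):
    print(f"job={job} child={child} -> P(participation) = {p:.3f}")
```

With these assumed values, only the profile combining a job and dependent children has a lower predicted probability, which is the pattern described for women. A pattern of separate effects, as described for men, would instead show non-zero `b_job` and `b_child`.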

III - From the face-to-face interview to the self-administered questionnaire

1 - Which variables are pertinent for analysing participation in the self-administered questionnaire?


After the main CAPI questionnaire, the interviewer gave patients a voucher worth 15 euros to thank them for their participation, and asked them to fill in a questionnaire completing the interview, which was then handed to the correspondent. The time needed to fill it in was estimated at around 20 minutes. Of the 2,932 participants, 522 (or 17.8%) refused to answer the self-administered questionnaire, in which case the interviewers were instructed to enter the reason via CAPI. The reason most often given was lack of time, either without any further detail (n = 243) or attributed to work obligations (n = 49), family obligations (n = 29), or another medical appointment (n = 25). Next came problems such as difficulty reading French (n = 73), tiredness (n = 54), eyesight problems (n = 28), and last, the presence of an accompanying person (n = 6). These reasons are consistent with the observations made by the interviewers, who generally observed that during the face-to-face interview the persons who later refused the self-administered questionnaire had seemed in a hurry, wished to stop the interview, or were tired.
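The figures quoted above can be checked with a short tally. This is only a sketch: the counts are those given in the text, but grouping the first four reasons under "time-related" is an assumption about how the list should be read.

```python
# Counts reported in the text for the self-administered questionnaire.
participants = 2932
refusals = 522

reasons = {
    "lack of time, no further detail": 243,
    "work obligations": 49,
    "family obligations": 29,
    "another medical appointment": 25,
    "difficulty reading French": 73,
    "tiredness": 54,
    "eyesight problems": 28,
    "presence of an accompanying person": 6,
}

# Assumed grouping: reasons that amount to a lack of time.
time_keys = (
    "lack of time, no further detail",
    "work obligations",
    "family obligations",
    "another medical appointment",
)

refusal_rate = refusals / participants
time_related = sum(reasons[key] for key in time_keys)
listed_total = sum(reasons.values())

print(f"refusal rate: {refusal_rate:.1%}")
print(f"time-related reasons: {time_related} of {refusals}")
print(f"reasons itemised in the text: {listed_total} of {refusals}")
```

The itemised reasons do not add up to all 522 refusals, so a few refusals presumably fall under reasons not listed in the text.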


So the self-administered questionnaire might in fact reinforce the previously identified biases concerning patient participation. Those with family or work obligations were more likely to refuse the additional questionnaire, which increased their under-representation in the sample, while persons with language problems, in particular foreigners, but also those with a lower educational level, were also more likely to refuse. In addition, after an already long interview, it is possible that the patients in poor health refused the self-administered questionnaire because they felt tired. However, the very purpose of the second questionnaire was to evaluate the state of health of HIV-positive patients. This bias causes more concern than the previous ones, because in this case the non-response may be elicited precisely by the situation we wish to measure (state of health).


The participation biases of the self-administered questionnaire can be examined more precisely using the information collected during the face-to-face interview. Alongside the control variables (sex, age, transmission group), we examined the effects of the following variables: nationality and educational level (for problems linked to the comprehension of written French), the presence of a child of 16 or less in the household, occupational status (distinguishing blue- and white-collar workers, more likely to have a stricter work schedule) and the fact of commuting for at least one hour between home and work (the three latter variables serve to determine family and work constraints). For the state of health, the following variables were used: daily intake of tranquillizers, daily intake of antidepressants, immune and viral status, signs of alcohol dependence (CAGE test, Mayfield et al., 1974), hepatitis C infection, side effects of treatment and lastly, a suicide attempt in the last 12 months.


The duration of the interview was in itself likely to influence participation in the self-administered questionnaire, but in two different ways: first, a face-to-face interview that lasted too long might cause patients to leave before the end because they were running late for other activities following their hospital appointment; conversely, if the interview was very brief, this might signify that the respondent was in a hurry and wished to cut short their participation as soon as the interview was over. This is why the interview duration in minutes, as well as its squared value, were included in the model, since the squared term makes it possible to identify non-linear effects with a local optimum.

2 - Factors linked to participation in the self-administered questionnaire


For some of the variables put forward to explain participation in the self-administered questionnaire, no significant relationship can be identified, even on the basis of a bivariate analysis: this is the case for several indicators pertaining to health status (immune and viral status, treatment side effects, hepatitis C infection, daily use of antidepressants—only just—, signs of alcohol dependency), but also for one hour or more of commuting between home and work (see Table 4), and finally for age.

Table 4 – Factors linked to having filled in the self-administered questionnaire proposed after the face-to-face questionnaire (logistic regression using a forward stepwise selection procedure)

Other relationships are significant only in the bivariate analysis: the positive effect tied to the daily use of antidepressants disappears when nationality is taken into account (in other words, French citizens more often take antidepressant medication and are more often willing to answer the self-administered questionnaire, but there is no causal link between these two behaviours); the impact of gender and of having a child of 16 or less living at home disappears when the transmission group is taken into account; and last, the effect of occupational status diminishes when the model includes educational level, and disappears when the transmission group and suicide attempts are introduced (people with the lowest educational level and the IVDU group, who answer the self-administered questionnaire less often, are more often jobless or blue-collar workers, whereas patients who had made a suicide attempt during the year, and who are also less willing to participate, are for the most part unemployed).
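The way a bivariate association can vanish once a third variable is controlled for can be illustrated with a small stratified calculation. The counts below are invented for illustration and are not survey data. They are built so that antidepressant use has no effect on response within each nationality group, while the pooled data still show an association.

```python
def odds_ratio(resp_exposed, n_exposed, resp_unexposed, n_unexposed):
    """Odds ratio of responding: exposed (antidepressant users) vs unexposed."""
    odds_exposed = resp_exposed / (n_exposed - resp_exposed)
    odds_unexposed = resp_unexposed / (n_unexposed - resp_unexposed)
    return odds_exposed / odds_unexposed

# Hypothetical counts (assumed for illustration, NOT survey data), as
# (responders among users, users, responders among non-users, non-users).
strata = {
    "French": (180, 200, 720, 800),   # 90% respond, with or without medication
    "foreign": (14, 20, 126, 180),    # 70% respond, with or without medication
}

# Within each nationality group the odds ratio equals 1: no association.
stratum_or = {name: odds_ratio(*counts) for name, counts in strata.items()}

# Pooling the groups (ignoring nationality) creates a spurious association,
# because French patients both use antidepressants more and respond more.
pooled = [sum(column) for column in zip(*strata.values())]
crude_or = odds_ratio(*pooled)

for name, value in stratum_or.items():
    print(f"{name}: odds ratio = {value:.2f}")
print(f"pooled (crude): odds ratio = {crude_or:.2f}")
```

With these assumed counts the crude odds ratio is about 1.18 although it is 1 in each group, which mirrors the reasoning given for antidepressants and nationality.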


Overall, the patients most likely to fill in the self-administered questionnaire were French citizens, the highly educated, those infected through homosexual transmission and those who had not attempted suicide during the past year. Once again, we find a greater propensity to respond among homosexuals, and the bias introduced by the use of written language is confirmed, along with the impact of nationality and educational level. On the other hand, only one indicator pertaining to mental health (suicide attempts) is significantly linked to participation, according to the model, implying that it is psychological difficulties rather than physical problems that might have induced bias.


Of course, it is possible that poor physical health may have played a role in the refusal to fill in the self-administered questionnaire, and that this fact was not registered due to a poor choice of variables to assess the respondents’ state of health. However, among the patients who did answer the second questionnaire, and for whom we could calculate an overall physical health score and an overall mental health score (on the basis of the SF36 standardized scales, see Leplege et al., 2001), the previously selected variables turned out to be very good indicators of the state of health: in particular, the perceived side effects of the treatment and co-infection by HCV are very strongly correlated with physical and mental health scores, while suicide attempts over the last twelve months are also strongly correlated with the mental health score.


Last, we note that the observed relationship between the time taken to complete the face-to-face questionnaire and the willingness to participate in the self-administered questionnaire is not linear but parabolic: the probability of accepting the second questionnaire is highest when patients spend about 75 minutes on the interview, and it decreases for both longer and shorter durations. The shorter the interview, the more likely it is that the patient is in a hurry and will refuse the self-administered questionnaire; the longer the CAPI interview, the more likely it is that the patient will have no time left for the questionnaire. This second observation suggests that if the first questionnaire had been shorter, the rate of response to the second questionnaire might have been higher.
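When a logit model contains a duration term and its square, the duration at which the predicted probability peaks follows directly from the two coefficients. The sketch below shows that calculation. The coefficient values are assumptions chosen only to reproduce a peak near 75 minutes; they are not the Table 4 estimates, and the function names are hypothetical.

```python
import math

def acceptance_probability(minutes, b0, b1, b2):
    """Predicted probability from a logit that is quadratic in duration."""
    logit = b0 + b1 * minutes + b2 * minutes ** 2
    return 1.0 / (1.0 + math.exp(-logit))

def turning_point(b1, b2):
    """Duration at which the quadratic logit is highest (needs b2 < 0)."""
    if b2 >= 0:
        raise ValueError("an interior maximum requires a negative squared term")
    return -b1 / (2.0 * b2)

# Hypothetical coefficients (assumed, NOT the survey's estimates).
B0, B1, B2 = -1.0, 0.06, -0.0004

peak = turning_point(B1, B2)
print(f"peak duration: {peak:.0f} minutes")
for minutes in (45, peak, 120):
    p = acceptance_probability(minutes, B0, B1, B2)
    print(f"{minutes:.0f} min -> P(accepts second questionnaire) = {p:.3f}")
```

Because the logistic function is increasing, the probability peaks at the same duration as the logit itself, so the turning point is simply minus the linear coefficient divided by twice the squared coefficient.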



Surveys conducted in hospitals, aimed at sub-populations which cannot be reached by more conventional survey procedures, often involve the use of long and/or multiple questionnaires based on a series of standardized scales measuring a given aspect of the respondents’ health. If surveys of this type are to become more frequent in coming years, then it is important to study the many sources of potential bias. The VESPA 2003 survey made it possible to conduct such a study, thanks mainly to the creation of an anonymous information register completed by a hospital correspondent.


Overall, the biases revealed are significant, both because they reach a certain statistical threshold and because they are interpretable, even though distortions in the interviewed sample are finally quite limited, at least as concerns the patients who answered the main questionnaire. As regards the self-administered questionnaire proposed after the interview, which focused on aspects of the health of HIV-positive patients, it seems that physical health did not influence participation, though for mental health the opposite was true.


The procedure chosen for random selection, simple to implement and designed to fit in with hospital routine, seems to have worked very well, both because the eligible population and the randomly selected population had almost identical characteristics, and also because “physician’s refusal” cases remained rare and were generally justified (poor state of physical or mental health). Note that this reduces the participation rate: since physicians were never—or rarely—able to solicit only supposedly “good” patients and to exclude difficult patients, “patient refusals” were more frequent.


Last, among non-participation factors, those related to the patient carry much more weight than those linked to the way the survey was organized [24]. In addition to the greater propensity to respond among persons infected through homosexual transmission, probably tied to their greater ability and willingness to participate, we also note the effect of occupational and family constraints, which probably differs between men and women and varies according to the type of employment. Likewise, the spoken or written language barrier puts foreigners at a particular disadvantage, especially for written questions.


In order to overcome such difficulties in the future without reducing the quantity of information collected, we will need to use tools designed for multilingual surveys (audio computer-assisted self-interviewing: audio CASI, cf. Rogers et al., 1999), or other alternative procedures which take up less of the patient’s time (such as the distribution of self-administered questionnaires which the patients fill in at home rather than at the hospital and send back by mail, cf. Florence et al., 2004).


